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The  proposed  electro-optical  image  algebra  processing  system  is  designed 
specifically  for  image  processing  and  other  related  computations.   The  design  is  a 
hybridization  of  an  optical  correlator  and  a  massively  paralleled,  single  instruction 
multiple  data,  processor.  The  architecture  of  the  design  consists  of  three  tightly 
coupled  components:  a  spatial  configuration  processor  (the  optical  analog 
portion),  a  weighting  processor  (digital),  and  an  accumulation  processor  (digital). 
The  systolic  flow  of  data  and  image  processing  operations  are  directed  by  a  control 
buffer  and  pipelined  to  each  of  the  three  processing  components.  The  image 
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processing  operations  are  defined  in  terms  of  basic  operations  of  an  image  algebra 
developed  by  the  University  of  Florida.  The  algebra  is  capable  of  describing  all 
common  image-to-image  transformations.   The  merit  of  this  architectural  design 
is  how  it  implements  the  natural  decomposition  of  algebraic  functions  into 
spatially  distributed,  point  wise  operations.  The  effect  of  this  particular 
decomposition  allows  convolution  type  operations  to  be  computed  strictly  as  a 
function  of  the  number  of  elements  in  the  template  (mask,  filter,  etc.)  instead  of 
the  number  of  picture  elements  in  the  image.   Thus,  a  substantial  increase  in 
throughput  is  realized. 

The  implementation  of  the  proposed  design  may  be  accomplished  in  many 
ways.  While  a  hybrid  electro -optical  implementation  is  of  primary  interest,  the 
benefits  and  design  issues  of  an  all  digital  implementation  are  also  discussed.  The 
potential  utility  of  this  architectural  design  lies  in  its  ability  to  control  a  large 
variety  of  the  arithmetic  and  logic  operations  of  the  image  algebra's  generalized 
matrix  product.   The  generalized  matrix  product  is  the  most  powerful  fundamental 
operation  in  the  algebra,  thus  allowing  a  wide  range  of  applications.  No  other 
known  device  or  design  has  made  this  claim  of  processing  speed  and  general 
implementation  of  a  heterogeneous  image  algebra. 
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CHAPTER  1 
INTRODUCTION 


The  objective  of  this  dissertation  is  to  define  the  logical  configuration  of  a 
hybrid  electro-optical  system  designed  specifically  for  image  processing.  The 
architectural  design  of  the  system  integrates  the  operation  of  an  optical  correlator 
with  a  massively  parallel  digital  processor.  This  combined  system  is  designed  to 
optimize  the  operations  and  functionality  of  an  image  algebra  developed  by  the 
University  of  Florida  under  Air  Force  contract  F08635-84-C-0295  [61].  It  is  in 
the  further  interest  of  the  Air  Force  that  the  proper  implementation  of  this 
concept  may  be  useful  for  advanced  guidance  systems  that  provide  Automatic 
Target  Recognition  (ATR).  This  issue  is  also  addressed  in  this  dissertation. 

There  are  many  processor  architectures  which  have  been  designed  for  image 
processing  applications  [12,13,17,29,34,40].  In  general,  such  applications  have  been 
implemented  as  a  limited  set  of  specific  processing  routines.   All  image-to-image 
transformations  can  be  fully  described  by  the  unary,  binary,  and  matrix  operations 
defined  in  the  aforementioned  image  algebra.   No  prior  architectural  design  has 
ever  been  developed  to  directly  support  the  functionality  of  the  image  algebra. 

The  low -level  abstraction  of  the  proposed  computational  system  is  founded 
on  the  mathematical  principle  of  discrete  convolutions  and  their  geometrical 
decompositions.   The  geometric  decompositions  combined  with  single  instruction 
multiple  data  (SIMD)  array  processing  requires  redefining  specific  algebraic 
operations  and  optimizing  their  order  of  execution.  The  logical  data  flow  of  such 


an  abstraction  leads  to  a  division  of  operations,  those  defined  by  pointwise 
operations  and  those  defined  in  terms  of  neighborhood  operations. 

The  intended  use  of  the  electro-optical  image  processing  system  proposed  in 
this  dissertation  is  that  of  an  integral  part  of  the  advanced  guidance  system  for  an 
airborne  autonomous  vehicle.  The  sensor  for  this  type  of  guidance  system  can  be  a 
synthetic  aperture  radar  or  a  two-  or  three-dimensional  infrared  imaging  device. 
The  proposed  system  provides  a  common  modular  hardware  solution  to  a  wide 
class  of  problems  associated  with  these  sensing  devices. 

Considering  other  applications,  the  implementation  can  range  in  use  from 
the  embedded  form  described  in  this  dissertation  to  a  general  use  coprocessor 
configuration  coupled  with  an  image  processing  workstation  [60].   Since  the  design 
involves  such  a  comprehensive  set  of  operations,  it  is  reasonable  to  assume  that  a 
general  use  implementation  would  have  a  wide  range  of  commercial  applications 
such  as  medical  imaging,  manufacturing  robotics,  etc.,  as  well  as  ATR. 

1.1  The  Requirement 

The  1991  war  in  the  Persian  Gulf  was  the  first  war  in  which  smart  weapons 
were  used.  The  term  "smart  weapon"  was  used  mostly  in  the  context  of  pilot  cued 
contrast  tracking.  The  inertially  navigated  unmanned  cruise  missile  was  as  close 
to  autonomous  guidance  as  the  state-of-the-art  would  allow. 

As  a  result  of  the  success  and  limitations  observed  in  this  generation  of 
smart  weapons,  the  projection  for  the  next  generation  is  that  we  will  have  more 
intelligent  systems  that  require  positive  target  identification  and  quick  response  in 
a  high  electronic  counter  measure,  low -observable  environment.  This  high  threat 
situation 


demands  a  reduction  in  the  need  for  pilot  cuing,  preferably  a  complete  stand-off, 
launch  and  leave  capability. 

The  real-time  computational  rates  for  these  next  generation  systems  are 
established  to  be  well  into  the  gigaops,  or  109  operations  per  second,  range.  In 
order  to  realize  the  necessary  rates,  technological  advances  must  be  made  using 
massively  parallel  computational  methods. 

1.2   The  ATR  Problem 

With  respect  to  any  weapon  delivery  system,  the  need  for  target  recognition 
is  fundamental  for  correct  system  performance.   The  rationale  for  automatic  target 
recognition  impacts  not  only  the  system's  performance  but  the  tactics  used  in 
delivering  the  weapon;  i.e.,  pilot  acquisition  not  required,  launch  and  leave. 

A  conventional,  blast -fragmentation  type  munition  is  typically  designed  to 
be  effective  against  an  array  of  different  targets.   Another  important  advantage  of 
recognizing  the  target  a  priori  is  knowing  when  and  how  to  detonate  the  munition. 
If  the  size  of  an  arbitrary  target  and  the  relative  location  of  its  most  vulnerable 
area  are  known,  then  the  munition  can  be  detonated  in  an  optimal  sequence. 
Optimally,  the  warhead  explosion  is  timed  to  affect  the  fragmentation  distribution 
(characterizing  velocity,  residual  mass,  and  direction),  to  maximize  the  kill 
potential.   Such  a  level  of  sophistication  in  a  weapon  system  can  only  be 
accomplished  as  an  automated  sub-system  function. 

The  automatic  recognition  and  subsequent  track  point  location  of  an  object 
that  is  contained  in  a  digital  image  sequence  is  accomplished  in  an  interrelated 
series  of  programmed  events  executed  per  image  frame.  These  events  have  been 
formally  defined  by  the  Automatic  Target  Recognizer  Working  Group  (ATRWG). 


The  ATRWG  is  a  joint  U.S.  Department  of  Defense/Industry  working  group 
sponsored  by  the  Defense  Advanced  Research  Projects  Agency  [7]. 

A  depiction  of  their  interrelationships  taken  from  ATRWG  report  number 
87-002  is  shown  in  Figure  1.   A  summary  of  the  ATRWG  definitions  is  as  follows: 

(1)  Preprocessor:  signal  conditioning,  image  enhancement,  and/or  image 
transformation.   The  primary  objective  of  this  procedure  is  to  condition  or  convert 
the  raw  image  data  for  the  next  event,  usually  the  points  of  interest  locator.   Noise 
reduction  filtering,  dynamic  range  of  image  intensity  via  histogram  modification, 
and  linear/nonlinear  transforms  that  globally  modify  the  high  spatial  frequencies 
of  an  image  are  some  of  the  techniques  that  are  typically  used. 

(2)  Interest  Point  Locator:  geometrical  or  statistical.  This  procedure  is  a 
basic  operation  that  identifies  locations  within  the  image  that  are  potential  object 
points.   Statistical  data  differences  or  values  computed  globally  may  be  used  to 
isolate  points,  or  specific  geometric  shapes  can  be  isolated,  automatically,  using 
template  matching  techniques. 

(3)  Segmentor:  edge  based,  region  based,  or  texture  based.  The  objective 
of  this  procedure  is  to  partition  the  image  into  regions  that  are  based  on  consistent 
information.   The  information  is  normally  a  refinement  of  the  input  to  the  Feature 
Extractor. 

(4)  Feature  Extractor:  geometrical  or  statistical.  The  computation  is  a 
fine  grain  identification  procedure  similar  to  that  of  the  Interest  Point  Locator. 
Specific  attributes,  or  feature  values,  are  determined  for  each  segmented  region 
and  used  to  form  a  corresponding  feature  vector.  This  syntactic  event  forms  the 
input  for  the  Classifier  whose  output  is  a  semantic  event. 

(5)  Classifier:   statistical  or  syntactic.  In  some  applications,  distinct 
distributions  can  be  determined  for  classes  of  feature  vectors.  Statistical  methods 


are  then  used  to  associate,  with  a  level  of  confidence  for  objects  of  interest,  the 
feature  vectors  that  are  output  from  the  Feature  Extractor  with  the  appropriate 
distributions  or  classes.   Other  applications  can  use  subcomponents  or  pattern 
primitives  that,  in  a  syntactic  sense,  relate  regions  to  classes.  The  syntactic 
method  is  favored  for  its  execution  speed;  however,  it  is  highly  dependent  upon 
system  resolution.   Both  methods  output  the  necessary  class  assignment  or  object 
labelling. 

(6)  Tracker:  pixel-to-pixel  or  object-to-object.  Depending  on  the 
sophistication  of  the  system,  the  tracker  can  be  driven  by  pixel  information  (e.g.,  if 
the  input  to  the  Tracker  is  taken  directly  from  either  the  Preprocessor  or  the 
Interest  Point  Locator)  or  by  object  precedence  (e.g.,  the  input  comes  from  the 
Feature  Extractor  or  Classifier).  Both  tracking  methods  are  tightly  coupled  to  a 
system's  guidance  law,  typically  a  rigid-body  response  feedback  loop  designed  to 
minimize  a  tracking  error  solution.   Note,  an  optical  correlator  tracking  system  is 
an  object-to-object  tracker  that  receives  its  input  directly  from  the  Preprocessor. 

The  ATR  problem  arises  when  the  rate  at  which  the  system  must  respond 
to  changes  in  kinematic  conditions,  called  the  update  rate,  is  higher  than  the 
information  processing  rate.  When  this  situation  occurs,  the  tracking  error  cannot 
be  minimized  and  in  the  worst  case  the  system  loses  its  ability  to  track.  For  this 
reason,  streamlined  information  processing,  current  weapon  systems  are  reduced  to 
some  form  of  pilot  cuing  coupled  with  contrast  tracking. 

1.3  Architectural  Considerations 

In  order  to  solve  the  ATR  problem,  the  next  generation  of  weapon  system 
guidance  must  incorporate  computational  methods  which  are  capable  of  processing 
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Figure  1.  ATR  Component  Structure,  Input/Output  Diagram. 


information  at  rates  that  are  several  orders  of  magnitude  higher  than  the 
respective  update  rate.  The  solution  is  a  time  complexity  issue. 

The  mathematical  notation  used  in  this  dissertation  to  evaluate  time 
complexity  is  the   O-notation  [1].  The  following  is  a  brief  review  of  the 
significance  of  the  notation. 

For  any  arbitrary  function  fin)  the  asymptotic  upper  bound  is  defined  in 
the   O-notation  as  j(n)  =  0(9(n))i  ^  and  only  if  there  exist  two  positive  constants 
c  and  n0  such  that    |Xn)l  ^c  l#(n)l    for  all  n  >  n0.   The  most  common 
references  for  computing  times  are 

0(1)  <  0(logn)  <  0(n)  <  0(nlogn)  <  0(n2)  <  0(n3)  <  ■  •  •<  0(2n). 

Whereas   O-notation  is  used  to  express  an  upper  bound,  fl(n)  is  used  to 
express  a  lower  bound.  It  is  defined  as:  j(n)  =  fl(p(n)),  if  and  only  if  there  exist 
two  positive  constants  c  and  n0  such  that    |Xn)l  >c  l<Kn)l    for  all  n  >  n0. 

If  g(n)  is  both  an  upper  and  lower  bound  on  j{v),  then  8  is  given  as  the 
notation.  Formally,  j{n)  =  B(#(n)),  if  and  only  if  there  exist  positive  constants  c^ 
c2,  and  n0  such  that  for  all  n  >  n0,  ct  |#(n)|  <  |Xn)|  <  c2  |</(n)|. 

The  question  that  arises  is,  how  can  application  algorithms,  as  summarized 
above  from  the  ATRWG,  be  processed  with  time  complexity   0(n),   0(log  n),  or 
even   0(1)?  The  approach  taken  in  this  dissertation  is  to  couple  the  benefit  of 
massively  parallel,  digital  computation  with  the  benefit  of  optical  computation  to 
provide  a  lowest  upper  bound. 

The  most  important  consideration  of  any  algorithm  is  its  mathematical 
structure.  Two  questions  associated  with  the  mathematical  structure  are: 
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(1)  Is  the  mathematical  structure  optimized  for  time  complexity  and  the 
kind  of  parallelism  provided  by  the  computer  in  use? 

(2)  What  type  of  architecture  provides  the  necessary  execution  speed  on 
the  basic  operations  specified  by  the  mathematical  structure? 

The  architectural  design  proposed  in  this  dissertation  provides  the 
necessary  speed  at  the  operator  level  of  the  algebraic  abstraction.  This  allows  the 
algorithm  designer  to  be  creative  at  the  conceptual  level  without  first  knowing  how 
the  parallel  operations  are  physically  mapped  to  the  underlying  architecture. 

Many  application  algorithms  involving  image  analysis  and  processing  are 
central  processing  unit  (CPU)  or  input /ouput  (I/O)  bound,  or  both;  i.e.,  an 
algorithm  may  be  less  robust  due  to  limitations  on  processor  or  data  bandwidths. 
Even  considering  the  hardware  speedup  due  to  developments  in  very  large  scale 
integrated  (VLSI)  components  and  complementary  metal-oxide  semiconductor 
(CMOS)  technology,  many  algorithm  tasks  still  require  throughputs  that  are 
beyond  current  capabilities.   Parallel  computer  architectures  have  the  capability  to 
overcome  many  of  these  bandwidth  limitations;  however,  the  communication 
among  processors  is  a  physical  constraint  that  is  not  easily  or  inexpensively 
overcome. 

Optical  systems,  on  the  other  hand,  have  such  a  high  space-bandwidth 
product  that  these  limitations  appear  insignificant.  In  the  past,  the  inherent 
inaccuracies  of  these  analog  systems  have  limited  the  use  of  such  devices  almost 
exclusively  to  correlation  operations.  The  interest  in  these  correlator  operations  is 
mainly  the  magnitude  and  location  of  the  cross -correlation  of  an  image  scene  with 
a  reference  subimage  (usually  an  object  to  be  identified  in  the  input  image  scene). 


A  large  signal  to  noise  ratio  can  be  tolerated  in  such  a  system  and  still  provide 
acceptable  autocorrelation  position  information. 

Recent  advances  in  spatial  light  modulator  design  have  reduced  optical 
system  inaccuracies  and  increased  their  switching  speeds  to  the  point  where  they 
are  now  viable  for  the  hybrid  system  described  in  this  dissertation  [18].  The 
approach  taken  in  this  dissertation  does  use  the  matched  filter/correlator  principle. 
However,  it  is  simply  taking  advantage  of  the  high  space-bandwidth  product  of  the 
optics  to  provide  a  fundamental  convolution  procedure  which  quickly  and 
efficiently  translates  the  input  image  while  minimizing  inter-processor 
communication. 

The  benefit  from  coupling  the  fine  grain,  SIMD  computational  system  with 
the  optical  correlator  can  be  summarized  by 

(1)  The  I/O  for  any  electrically  connected  SIMD  design  is  a  major 
bottleneck  in  terms  of  system  throughput.   Most  implementations  of  these  designs 
that  are  directed  towards  image  processing  reduce  the  effect  of  this  condition  by 
fixing  the  size  of  template,  or  convolution  type,  operations  over  images  so  that  an 
I/O  pipeline  can  be  conveniently  configured.   By  allowing  the  optical  analog  to 
perform  the  necessary  image  translations  in  free-space,  this  restriction  is 
eliminated.  Thus,  any  template  configuration  can  be  algebraically  processed  at  a 
cost  proportional  to  the  number  of  elements  in  that  template. 

(2)  The  free-space  distribution,  combined  with  the  algebraic  decomposition 
provided  in  this  dissertation,  is  a  simple  solution  to  an  otherwise  complex 
inter-processor  communication  control  problem. 

(3)  The  complete  arithmetic  and  logical  control  the  image  algebra  has  over 
the  Preprocessor  event,  as  defined  by  ATRWG,  provides  a  more  efficient  and 
flexible  optical  correlation  system,  a  proposed  solution  to  the  ATR  problem. 


10 

(4)  The  method  of  sorting,  defined  in  Chapter  5,  while  generating  the  ith 
rank-order  for  order  statistic  filtering  is  unique.  It  removes  the  combinatorial 
constraint  incurred  with  the  use  of  large  template  configurations  and  either  stack 
filter  or  morphological  operations.  The  order  statistic  filtering  operation  using  the 
proposed  design  is  executed  at  essentially  the  same  cost  function  as  a  single 
convolution  operation  with  the  same  template  configuration. 

(5)  The  insight  gained  from  the  development  of  this  hybrid  optical  design 
will  lead  to  further  research  into  an  all  digital  design  for  more  general  applications. 
Also,  a  coupling  of  the  SIMD  approach  taken  here  with  a  holographic,  free-space 
information  distribution  method,  sometimes  referred  to  as  an  optical  crossbar 
switch,  will  provide  an  even  greater  reduction  in  the  switching  time  among 
processes. 

1.4  Outline  of  the  Dissertation 

A  historical  perspective  on  the  development  of  the  image  algebra  by  the 
University  of  Florida  and  its  relationship,  similarities  and  differences,  to  other 
image  algebras  (most  notably  binary  image  algebra  [41])  is  the  discussion  of 
Chapter  2.   Chapters  3  and  4  cover  the  conceptual  descriptions  of  two  hybrid 
optical  configurations.  The  reason  for  presenting  two  separate  configurations  is 
to  show  applicability  to  a  high  resolution,  holographic  film  implementation  as  well 
as  a  solution  to  the  ATR  problem.   The  first  concept  is  a  programmable  spatial 
light  modulator,  optical  correlator  implementation.  This  configuration  offers 
greater  flexibility  but  at  lower  resolution,  which  is  adequate  for  a  solution  to  the 
ATR  problem.   The  second  design  configuration  is  based  on  a  Vander  Lugt  optical 
correlation  principle  which  uses  preset  templates  resulting  in  a  different  approach 
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to  the  image  algebra  integration.   Chapter  5  presents  a  discussion  on  the  use  of 
order  statistic  filters,  with  respect  to  the  ATR  problem,  and  their  integration  with 
the  design  configuration  proposed  in  Chapter  3.   Other  methods  of  generating  the 
order  statistics  are  investigated  and  compared  to  a  proposed  method  that  operates 
at  or  nearly  the  same  speed  as  a  single  convolution.   Finally,  Chapter  6  concludes 
the  dissertation  with  a  discussion  on  the  use  of  these  findings  for  further  research. 


CHAPTER  2 
BACKGROUND 


Research  in  the  field  of  optical  computing  has  increased  considerably  over 
the  last  decade,  primarily  due  to  the  inherent  parallelism  provided  by  optical 
devices.   The  alluring  idea  of  complex  computations  performed  literally  at  the 
speed  of  light  has  been  perceived  as  a  panacea  for  a  wide  range  of  problems. 

To  give  a  quantitative  example  of  the  speed  advantage,  consider  the  upper 
bound  on  the  speed  of  electronic  devices  to  be  on  the  order  of  picoseconds  (10~12 
seconds);  optical  devices  are  easily  three  orders  of  magnitude  higher,  femtoseconds 
(10"15  seconds)  [5].  Parallelism  in  the  optical  device  is  quantified  by  the 
space— bandwidth  product  (SW)  of  the  optical  system.  The  SW  is  equivalent  to 
the  maximum  number  of  spatial  degrees  of  freedom  which  is  directly  proportional 
to  the  product  of  the  areas  of  the  input  plane  and  the  spatial  frequency  plane  and 
inversely  proportional  to  the  square  of  the  product  of  the  wavelength  of  the  light 
and  the  focal  length  of  the  system.  More  formally, 

SW  =  a  A  /  (Af)2, 

where  A  is  the  wavelength  of  the  light,  f  is  the  focal  length  of  the  system,  a  is 
the  area  of  the  input  image,  and  A  is  the  area  of  the  spatial  frequency  plane  [5]. 
Considering  the  digital  storage  required  for  the  equivalent  space-bandwidth 
product,  it  is  clear  that  the  optical  device  is  superior.  It  is  for  these  reasons  that 
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the  Air  Force  devotes  considerable  research  funds  to  the  further  development  of 
optical  computing  methods  that  provide  solutions  to  the  ATR  problem. 

The  optical  method  reported  in  this  dissertation  as  a  solution  to  the  ATR 
problem  is  a  hybrid  of  a  massively  parallel  digital  system  and  an  optical  correlator. 
The  reasons  for  selecting  this  particular  optical  system  are  the  maturity  of  the 
technology  and  the  precision  of  the  target  identification  process.   Also,  proprietary 
sources  have  demonstrated  the  capability  of  storing  many  thousands  of  frequency 
transformed  reference  images  in  a  crystal  lattice  composition.   The  crystal  is  then 
installed  in  the  correlator  as  a  multiple  reference  matched  filter. 

The  preprocessing  step  is  critical  for  the  proper  operation  of  an  optical 
correlator  with  a  multiple  reference  matched  filter.   The  raw  input  image  may, 
according  to  the  particular  application,  need  to  be  dynamically  enhanced,  filtered, 
rotated,  scaled,  and/or  thresholded.   Operations  such  as  these  are  necessary  to 
ensure  the  highest  probability  of  autocorrelation  for  the  system;  i.e.,  the  highest 
signal  to  noise  ratio.  They  also  need  to  be  generally  programmable  and  executable 
without  significantly  degrading  the  speed  of  the  system. 

Massively  parallel  computation  at  nearly  the  speed  of  light  is  a  reality. 
What  remains  is  the  proper  interface  to  programmatically  control  the  application. 
This  dissertation  presents  that  interface  and  its  relationship  to  an  algebra  that  is 
specifically  designed  to  express  image  processing  operations. 

2.1  Image  Algebra  Overview 

In  recent  years,  several  image  algebras  have  been  developed  [29,41,67,69]. 
These  algebras  are  for  the  most  part  homogeneous,  having  a  finite  number  of 
operations  for  transforming  one  or  more  elements  of  a  single  set  into  another 
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element  of  the  same  set.  The  mathematics  of  image  processing,  in  the  most 
general  sense,  forms  a  heterogeneous  algebra  because  it  involves  many  types  of 
operations  on  and  between  elements  of  different  sets.  If  an  algebra  A  properly 
contains  subalgebras  that  are  isomorphic  to  some  other  algebras,  then  A  is  more 
•powerful  than  these  algebras;  i.e.,  A  contains  all  the  laws  and  rules  of  these 
algebras  plus  more  [62].  The  algebra  we  are  about  to  discuss  is  a  heterogeneous 
algebra  specifically  designed  for  image  processing  and  is  more  powerful  than  other 
image  processing  algebras,  such  as  mathematical  morphology,  currently  discussed 
in  the  literature  in  that  it  contains  these  algebras  as  isomorphic  subalgebras  [22]. 

In  1984  Ritter  et  al.,  under  joint  sponsorship  of  the  Air  Force  Armament 
Laboratory  (AFATL)  and  the  Defense  Advanced  Research  Projects  Agency,  began 
the  development  of  a  unifying  mathematical  theory  for  concepts  and  operations 
encountered  in  image  and  signal  processing.  The  goal  was  to  develop  a  complete, 
unified  algebraic  structure  that  would  provide  a  common  mathematical 
environment  for  image  processing  algorithm  development,  optimization, 
comparison,  coding,  and  performance  evaluation  [57].   The  outgrowth  of  this  effort 
was  the  AFATL  image  algebra,  which  we  shall  simply  refer  to  as  the  image 
algebra. 

The  image  algebra  is  a  heterogeneous  algebra  capable  of  expressing  all 
image  processing  transforms  in  terms  of  its  many  different  operations  over  a  family 
of  sets,  its  operands  [63].  It  manifests  itself  as  a  theory  concerned  with  the 
interrelationship  of  two  broad  mathematical  areas,  point  sets  (subsets  of 
topological  spaces)  and  value  sets  (subsets  of  algebraic  structures;  e.g.,  groups, 
rings,  vector  spaces,  lattices,  etc.).  The  algebraic  manipulation  and 
transformation  of  geometric/spatial  information  is  the  basis  for  the 
interrelationship.   The  transformation  of  geometric  data  may  result  in  geometric, 
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statistical,  or  syntactic  information  represented  by  images.   The  point  and  value 
sets  along  with  images  and  templates  form  the  basic  types  of  operands  for  the 
algebraic  structure.   Operations  on  and  between  images  are  the  natural  induced 
operations  of  the  algebraic  system.   Operations  between  images  and  templates  and 
between  templates  and  templates  are  based  on  the  definition  of  a  generalized 
product  [56]. 

Isomorphisms  have  been  proven  to  exist  between  subalgebras  of  the  image 
algebra  and  the  homogeneous  algebras  previously  mentioned  [61].   Even  more 
noteworthy,  subalgebras  of  the  image  algebra  have  been  proven  to  be  isomorphic 
with  standard  Linear  Algebra  and  Minimax  Algebra  [22,30].  Image  algebra 
operations,  both  unary  and  binary,  are  not  only  capable  of  defining  commonly  used 
low  level  transforms,  but  higher  levels  of  processing  as  well,  these  include  neural 
network  methods,  feature  space  manipulation,  and  others  [61].  A  complete 
description  of  the  image  algebra  can  be  found  in  Ritter  et  al.  [63]. 

2.1.1  Value  Sets  and  Point  Sets 

In  image  algebra,  value  sets  are  synonymous  with  homogeneous  algebras. 
For  a  definition  of  an  algebra,  refer  to  [56].   An  arbitrary  value  set  will  be  denoted 
by  the  symbol  F.   Some  of  the  more  common  examples  of  value  sets  used  in  image 
processing  are  the  sets  of  integers,  real  numbers,  complex  numbers,  binary 
numbers  of  fixed  length  k,  and  extended  real  numbers  (which  include  one  or  both 
of  the  symbols  +<d  and  -u>).  These  value  sets  are  denoted  by  I,  R,  C,  L,  R       =  R 
U  {+ao},  R      =  R  U  {— od},  and  R     =  R  ,     U  R     ,  respectively.   The  operations  on 
and  between  elements  of  a  given  value  set  F  e  {I,  R,  C,  1  k,  R  ,    ,  R     ,R^  }  are  the 

°  l  2        +oo      — oo     ±00 

usual  arithmetic  and  lattice  operations  associated  with  F  [61].  For  example,  the 
operations  on  the  elements  of  the  value  set  F,  where  F  =  R,  are  the  arithmetic  and 
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logic  operations  of  addition,  multiplication,  and  maximum,  and  the 
complementary  operations  of  subtraction,  division,  and  minimum.   The  operations 
on  subsets  of  F  are  the  operations  of  union,  intersection,  set  subtraction,  choice 
function  and  cardinality  function,  denoted  by  U,  n,  \,  choice  and  card, 
respectively.   The  choice  function  returns  an  arbitrary  element  of  the  subset  of  F 
while  the  cardinality  function  returns  the  number  of  elements  in  the  subset  of  F 
[63]. 

In  image  algebra,  the  notion  of  a  point  set  is  equivalent  with  that  of  a 
topological  space.  Point  sets  most  applicable  to  present  day  image  processing 
tasks  are  subsets  of  n-dimensional  Euclidean  space  K  .  These  point  sets  provide 
the  flexibility  of  defining  rectangular,  hexagonal,  toroidal,  spherical,  and  parabolic 
discrete  arrays  as  well  as  infinite  subsets  of  Rn  [62].   The  letters  X,  Y  and  W 
denote  point  sets,  and  elements  of  point  sets  are  denoted  by  bold  lower  case  letters 
to  represent  vectors  in  K  .  In  particular,  if  x  6  X  and  X  c  IR  ,  then  x  =  (xv 
x2>"  * "  >  xn)>  wnere  Xj  €  R  V  i  =  1,  2,  •  •  •  ,n.  Image  algebra  operations  acting  on 
subsets  of  point  sets  are  the  operations  of  U,  fl,  \,  choice  and  card.  The  operations 
acting  on  or  between  elements  of  point  sets  are  the  usual  operations  between 
coordinate  points;  i.e.,  vector  addition,  scalar  and  vector  multiplication,  dot 
product,  etc.  [63]. 

2.1.2  Images 

An  image  is  the  most  fundamental  operand  in  the  image  algebra  and  is 
defined  in  terms  of  value  sets  and  point  sets.   Given  point  and  value  sets  X  and 
F,  respectively,  an  F-  valued  image  a  on  X  is  a  function  a:X — >F.   The  data 
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structure  associated  with  an  IF -valued  image  a  on  X  is  the  graph  of  the  function 
a,  which  we  again  denote  by  a;  i.e., 

a  =  {(x,a(x))  :  x  6  X},  (1) 

where  a(x)  e  F  (see  Figure  2). 

The  set  X  is  called  the  set  of  image  points  of  a  and  the  range  of  the 
function  a  (which  is  a  subset  of  F)  is  the  set  of  image  values  of  a.  The  pair 
(x,a(x))  is  called  a  picture  element  or  pixel,  where  x  is  the  pixel  location  of  the 
pixel  value  a(x).  The  set  of  all  F -valued  images  on  X  is  denoted  by  F1  [63]. 

2.1.3  Operations  on  Images 

The  fundamental  operations  on  and  between  F -valued  images  are  the 
naturally  induced  operations  of  the  algebraic  structure  of  the  value  set  F;  in 
particular,  if  F  =  R,  then  the  induced  operations  on  [R     are  the  pixel -wise  binary 
and  unary  operations  of  addition,  multiplication,  maximum,  minimum,  absolute 
value,  exponentiation,  etc.  These  pixel-wise  operations  will  be  executed  by  the 
weighting  and  accumulation  processors  of  the  proposed  architecture  [61]. 

2.1.4  Generalized  Templates 

A  template  in  Ratter's  image  algebra  is  a  mathematical  entity  that 
generalizes  the  concepts  of  templates,  masks,  windows  and  structuring  elements, 
thereby  playing  an  important  role  in  expressing  image  transformations.  Templates 
provide  a  tool  for  describing  image  operations  where  each  pixel  value  in  the  output 
image  depends  on  the  pixel  values  within  some  predefined  configuration  of  the 
input  image  [49,  63]. 
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a=  |  (x,a(x)):  xcX,  a(x)cFJ 


If  the  chosen  value  set  IF  is  equal  to  IR,  then  any  given  point  set  X 


will  define  a  real  valued  image  a  on  X,  a  :  X — ►IR  where  x 


is  the  pixel  location  and  a  (x )  is  the  pixel  value  at  x.  In  this  particular 

2  2 

illustration  we  choose  a  discrete  rectangular  point  set  X  C  Z   C  IR, 

where  Z2  =  {(U)  :i.j<  Z}- 


Figure  2.  Example  of  a  2-D  rectangular  real-valued  image. 
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Let  X  and  Y  be  point  sets  and  F  a  value  set.  A  generalized  F -valued 
template  t  from  Y  to  X  is  a  function  t :  Y  — » F  .  Thus,  a  template  is  simply 
an  image  whose  values  are  images;  i.e.,  a  template  from  Y  to  X  is  an  Fx-valued 

X   V 

image  on  Y.  If  t  6  (F  )  ,  then  for  each  y  6  Y,  t(y)  is  an  F -valued  image  on  X. 
The  set  of  all  F -valued  templates  is  denoted  by  (FX)Y.  The  sets  Y  and  X  are 
referred  to  as  the  target  domain  and  the  range  space  of  t,  respectively.  For 
notational  convenience,  the  symbol  t    is  defined  as  t    =  t(y).   The  pixel  location 
y  at  which  a  template  ty  is  evaluated  is  called  the  target  point  of  t  and  the 
values  ty(x)  are  called  the  weights  of  t  at  y  [22,63]  (see  Figure  3). 

If  t  is  a  real  or  complex  valued  template  from  Y  to  X,  then  the  support 
of  ty  is  defined  by  e/(ty)  =  {x  €  X  :  ty(x)  #  0}.  If  t  is  an  extended  real  valued 
template,  then  the  following  supports  at  infinity  are  also  defined: 

G/(ty)  =  {x6X:ty(x)^o,}, 

^(ty)  =  {ieX:ty(i)N}, 
and  ^(ty)  =  {x  €  X  :  ty(x)  i  ±(B}. 

Templates  are  classified  as  either  translation  invariant  or  translation 
variant.  Translation  invariance  is  an  important  property  because  it  allows  image 
processing  algorithms  and  transformations  that  make  use  of  invariant  templates  to 
be  easily  mapped  to  a  wide  variety  of  computer  architectures  [17,34,37,49]. 

77  77  XX 

Suppose  X  =  R     or  X  =  I  .  A  template  t  e  (F  )     is  said  to  be  translation 
invariant  if  and  only  if  for  each  triple  x,  y,  z  e  X,  t  (x)  =  t     z(x  +  z). 
Translation  invariant  templates  have  fixed  supports  and  fixed  weights  and  can  be 
represented  pictorially  in  a  concise  manner.   A  template  which  is  not  translation 
invariant  is  called  translation  variant,  or  simply  a  variant  template.  These 
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ty=   {  (x,ty(x)):   x€X,ty(x)elF} 

Let  X  and  Y  be  point  sets  and  F  a  value  set. 

A  generalized  IF  -valued  template  t  from  Y  to  X  is  a  function 

t  :  Y— *  Fx  [  denoted  t  €  (FX)Y  J.    For  each  target  point    y€  Y 

ty=  t(y) 
where  the  values  t  ( x )  are  the  weights  of  the  template  t  at  y. 

In  this  example,  Y  =  I2,  XC  Z2  is  a  rectangular  array,  and  IF  =  IR. 
The  support  of  t  at  target  point  y  is  shown  in  shaded  form. 


Figure  3.  Example  of  a  2-D  template. 
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templates  have  supports  that  can  vary  in  shape  and  weights  at  different  target 
points  [63,68]. 

2.1.5  Operations  Between  Images  and  Templates 

Several  common  image-template  operations  as  defined  in  the  image  algebra 
are  used  to  describe  image  transformations  that  make  use  of  local  and  global 
convolutions  and  to  change  the  dimensionality  or  size  and  shape  of  images.   Such 
transformations  include  image  rotation,  zooming,  image  reduction,  masked 
extractions  and  matrix  multiplication.  In  an  effort  to  generalize  the  definition  of 
basic  operations  between  images  and  templates,  the  global  reduce  operation  and 
the  generalized  product  between  an  image  and  a  template  are  introduced  [63]. 

Let  X  c  R     be  finite,  where  X  =  {xv  x2,-  •  • ,  x^.  If  7  is  an  associative 
and  commutative  binary  operation  on  the  value  set  F,  then  the  global  reduce 
operation  T  on  F     induced  by  7  is  defined  by 


ra  =  r   a(x)  =  a(Xl)  7  a(x2)  7  ...  7  a(xj,  (2) 

xex 


where  a  6  F     and  T  is  the  mapping  function  T  :  F    -»F. 

Image-template  operations  are  defined  by  combining  an  image  and  a 
template  with  the  appropriate  binary  operation.  This  is  accomplished  by  means  of 
the  generalized  product,  specifically,  the  generalized  backward  and  forward 
convolution  operations  [22,63].   Let  fv  F2  and  F  be  three  value  sets,  Y  c  Rm,and 
O :  Fj  x  F2  — ►  F  be  a  binary  operation.  If  7  is  an  associative  and  commutative 

X  TV 

binary  operation  on  F,  a  6  F  i  and  t  e  (F2)  ,  then  the  generalized  backward 
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convolution  operation  (the  generalized  forward  convolution  operation  is  omitted 

T  TV  V 

here)  of  a  with  t  is  the  binary  operation  © :  F  t  x  (F2)    — ►  F     defined  by 


a  ©  t  e  {(y,b(y)):  b(y)  =T   a(x)  O  ty(x),  y  e  Y}.  (3) 


Notice  that  the  input  image  a  is  an  Fj-valued  image  on  the  point  set  X 
while  the  output  image  b  is  an  F -valued  image  on  the  point  set  Y.  Templates 
can  therefore  be  used  to  transform  an  image  with  range  values  defined  over  one 
point  set  to  an  image  on  a  completely  different  point  set  with  entirely  different 
values  from  the  original  image  [22,63]. 

A  wide  variety  of  image-template  operations  can  be  derived  by  substituting 
various  binary  operations  for  7  and  O  in  the  definitions  stated  above  for  the 
generalized  backward  and  forward  template  operations  (see  Figure  4)  [63].  There 
are,  however,  three  basic  image-template  operations  defined  in  the  image  algebra 
that  are  sufficient  for  describing  most  image  transformations  encountered  in 
current  areas  of  image  processing  and  computer  vision  applications.  The  three 
basic  operations  that  are  considered  in  this  dissertation  are  generalized  convolution, 
additive  maximum,  and  multiplicative  maximum,  and  are  denoted  by  ©,  0,  and  © 
respectively.   The  operation  ©  is  a  linear  operation  that  is  defined  for  real  and 
complex  valued  images.  The  other  two  operations,  0  and  ©,  are  nonlinear 
operations  which  are  applied  to  extended  real  valued  images  [22,63]. 

Let  X  c  R71  be  finite  and  Y  c  Rm.  If  a  e  R1  and  t  e  (RX)Y,  then  the 
backward  generalized  convolution,  see  Figure  5,  (the  forward  direction  is  omitted 
here)  is  defined  as 


a©t  e  {(y,c(y))  :  c(y)  =  £  a(x)-ty(x),  y  6  Y}.  (4) 

xei 
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X  T      Y 

If  a  G  R       and  t  G  (R    )  ,  then  the  backward  additive  maximum  operation 


- 00  v     — CD 

is  defined  as 


a  B  t  =  {(y,c(y))  :  c(y)  =  V  a(x)+t  (x),  y  G  Y},  (5) 

xGI 


where    V  a(x)  +  t  (x)  =  max  {a(x)  +  t  (x)  :ieX}. 
xGI 


If  a  G  (R  7   )     and  t  G  ((R-   )  )  ,  then  the  backward  multiplicative 


+ar  u    +ao 

maximum  operation  is  defined  as 


a©t  e  {(y,c(y))  :  c(y)  =  V  a(x)-t  (x),  y  G  Y}.  (6) 

xGI 


The  complementary  operations  of  additive  and  multiplicative  minimum  (  0 
and  ® )  are  defined  in  terms  of  the  additive  and  multiplicative  dual  of  images  and 
templates  and  the  image-template  operations  of  0  and  ©  [63].  The  relationship 
between  the  additive  maximum  and  minimum  is  given  in  terms  of  lattice  duality  by 


*       *,  * 


aHt  =  (t  0a  )  , 


I   \Y 


where  the  image  a    is  defined  by  a  (x)  =  [a(x)]  ,  and  the  dual  of  t  G  (RtJ     is 
the  template  t*  G  (n£jx  defined  by  t*(y)  =  [ty(x)]*.  It  follows  that  the  additive 
dual  can  be  simply  defined  as  tx(y)  =  — t  (x).  A  similar  correspondence  for  the 

multiplicative  dual  is  defined  by  Ritter  et  al.  [63]. 
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a(7)t  ={  (y,b(y)):  b(y)  =  r    a(x)ot  (x),  ycv} 
^        l  xeX  y  J 


Replace 


® 

T    and  0 
xeX 

Symbol: 

with: 

Resulting  Operation: 

© 

y    and    * 

X6X 

Spatial  Convolution  or  Correlation  [62] 

^D 

U   and   fl 
xeX 

Set  Theoretic  Dilation  [66] 
(where  IF  is  a  set  of  sets) 

0 

V    and   j_ 

X€X 

Morphological  Dilation  (additive)  [62] 

© 

V    and    * 
xeX 

Multiplicative  Lattice  Dilation  [62] 

QlA 

OR  and  AND 
xeX 

Boolean  Dilation  [40] 

Figure  4.   Examples  of  generalized  convolution  substitutions. 
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►  X 


b  =  a©t  ={  (y,b(y)):b(y)=5]  a(x)t  (x),  y€  y} 

xeX 


Let  X  be  a  finite  subset  of  zf  Y  =  22  c  |Rf  and  choose  |F=  |R. 
Suppose  afFxand  t«(Fx)*  then  where  ever  the  image  ty  intersects 
the  image  a,  the  convolved  value  b(y)  will  be  located  at  the  target  point  y. 


Figure  5.  Example  of  a  2-D  generalized  backward  convolution. 
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2.1.6  Operations  Between  Generalized  Templates 

The  basic  binary  operations  between  generalized  templates  are  the  same 
pointwise  operations  inherent  in  the  value  set  F  [22].  These  are  the  same 
operations  defined  between  images. 

The  image-template  operations  of  ©,  0,  and  ©  generalize  to  operations 
between  templates.  When  used  as  template— template  operations,  these  image 
algebra  operations  enable  large  invariant  templates  to  be  decomposed  into  smaller 
invariant  templates  thereby  providing  a  means  for  mapping  image  transformations 
to  various  computer  architectures  [68].  This  process  is  called  template 
decomposition.   Combining  two  or  more  templates  with  the  operations  of  ©,  0 
and/or  ©  is  called  template  composition. 

2.2  Digital  Optical  Cellular  Image  Processor  (DQCIP)  -  A  Comparison 

The  concept  of  optical  image  processing  has  been  an  item  of  considerable 
research  [5,26].   Particular  interest  has  been  paid  to  optical  array  methods  which 
remove  the  interconnection  bottleneck  common  to  conventional  semiconductor 
parallel  architectures  [29,40,41].  The  reason  for  evaluating  the  DOCIP  is  to  show 
that  although  this  highspeed  optical  image  processor  has  a  limited  potential  for  use 
in  an  ATR  problem  solution,  it  does  not  have  the  robust  image  algebra  operational 
capability  of  the  designs  presented  in  this  dissertation. 

2.2.1  General  Description 

The  DOCIP,  presented  by  Huang  et  al.,  is  designed  to  provide  parallel 
image  processing  tasks  using  the  inherent  optical  parallelism  and  3-D  free  space 
interconnection  capabilities  [41].  Both  cellular  array  and  cellular  hypercube 
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designs  are  presented  in  [41].  Only  the  hypercube  design,  with  its  higher  degree  of 
connectablity,  is  of  interest  here  since  it  has  the  better  execution  time  complexity, 
0(logN)  versus  0(NlogN)  for  the  cellular  array  [41]. 

The  DOCIP  is  intrinsically  a  fine  grained  SIMD  machine  where  each  cell 
operates  with  low  hardware  complexity.  The  hypercube  architecture  is  used  to 
manipulate  structuring  element  locations  and  configurations,  either  4  or  8 
connectedness  in  the  cellular  automata  sense,  with  respect  to  each  pixel  in  an  NxN 
image.  The  overall  design  of  the  architecture  is  tightly  coupled  to  the  underlying 
binary  image  algebra  constructs. 

2.2.2  Binary  Image  Algebra 

The  programming  language  initially  developed  for  the  DOCIP  was  later 
established  as  a  coherent  homogeneous  algebra  referred  to  as  binary  image  algebra 
(BIA)  [29,41].  BIA  was  another  attempt  at  a  unifying  theory.  Its  formulation  was 
based  on  a  set  of  three  specific  fundamental  operations  taken  from  the  structures  of 
mathematical  morphology  [67]. 

Mathematical  morphology  is  used  to  determine  structure  in  a  subimage  by 
transforming  underlying  modeling  sets.  The  modeling  sets  are  sets  of  points 
known  as  structuring  elements.  The  structuring  elements,  defined  in  the  image 
algebra  by  the  support  of  a  template,  are  designed  in  a  such  a  way  as  to  reveal  the 
spatial  characteristics  of  objects.  Morphology  used  as  an  algebra  for  image 
processing  fails  as  a  universal  image  algebra  due  to  its  set— theoretic  formulation 
[57].  Conceptually,  in  a  Turing  machine  sense,  it  is  possible  to  use  these 
set -theoretic  operations  in  the  linear  domain,  transformations  between  different 
domains,  and  transformations  between  different  value  sets;  however,  the 
practicality  is  beyond  that  which  is  reasonable.   On  the  other  hand,  the  image 
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algebra  defined  by  Ritter  et  al.  does  not  fail  in  this  regard.   Furthermore,  it 
includes  all  morphological  operations,  either  binary  or  gray  scale,  as  a  subalgebra 
[22]. 

Formally,  the  binary  image  algebra  is  defined  by  a  set  S  and  family  of 
operations  F  over  that  set.   The  set  S  is  further  defined  as  the  power  set  of  a 
predefined  universal  image   W,  denoted  P(  W)  [41].  The  universal  image  is  the  set 

^={(x,y):xeZ±n,yeff±n},  (7) 

where  1     =  {  0,  ±1,  ±2,  •  •  •,  ±n  }  and  n  is  a  positive  integer.   Unfortunately, 
this  definition  does  not  directly  convey  the  notion  that  this  is  an  algebra  over 
binary  images.   Additional  constraints  are  added  which  restrict  the  values  at  each 
point   (x,y)  to  1  (foreground)  and  0  (background). 

The  power  set   P{  W)  can  be  defined  simply  as  the  set  of  all  images  such 
that  the  picture  elements  (pixels)  of  each  image  belong  to  the  value  set   {0,  1}. 
The  image  algebra  defined  by  Ritter  et  al.  defines  any  image  belonging  to  this 
power  set  as 

a={(x,a(x)):xeX},  (8) 


where  a(x)  €  {1,  0}  and  x  is  any  vector  belonging  to  the  point  set  1±  *  I±  .  In 
particular,  in  Ritter's  algebra  the  set  I  ,  where  1  =  {0,  1}  and  X  =  I±  *  I±  , 
corresponds  to  Huang's  universal  set   5  of  Boolean  images  on  P(  W). 

The  family  of  operations  F  is  made  up  of  three  fundamental,  non  0-ary 

* 
operations:  dilation,  union,  and  complement,  denoted  ©  (*  is  used  to  avoid 

confusion  with  the  image  algebra  notation),  U,  and  ",  respectively.  A  subset  of 
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P{  W)  containing  five  images  referred  to  as  elementary  images  is  used  to  derive 
the  necessary  structuring  elements  needed  to  perform  all  morphological  functions 
of  the  three  fundamental  operations  [41].  The  elementary  images  are  denoted  /, 
A,  A]1  B,  B:x  Huang  denotes  his  binary  image  algebra  symbolically  by 

BIA  =  (  P{  W);  e  ,  U,  -,  /,  A,  A;'  B,  B+ ).  (9) 

Again,  we  must  point  out  that  this  notation  does  not  specify  the  algebra 
completely  since  the  power  set  P{  W)  is  not  the  same  as  the  set  of  all  Boolean 
images  on  P(W). 

Two  of  the  three  fundamental  operations,  union  and  complement,  are  basic 
operations  of  any  useful  set -theoretic  algebra  (BIA  and  the  image  algebra  defined 
in  [63]  are  no  exceptions)  and  need  no  further  clarification.   The  dilation  operation, 
in  the  strict  morphological  sense,  is  the  union  of  all  input  image  translations  by 
the  points  in  the  structuring  element  [67].   BIA  defines  it  as 


* 


X®  R=  (10) 

{(X!+  X2'     yi+  y2}  G    W:  {Xl'y?  G  X>  (VP  ZR}(X*<1>)*(R*  <P) 

<p      otherwise, 


where  X  is  an  input  image,  R  is  a  reference  image  derived  using  the  three 
fundamental  operations  in  combinations  with  one  or  more  of  the  five  elementary 
images,  <p  is  the  null  image,  and  A  means  logical  and.   Once  again,  the  definition 
does  not  clearly  express  the  intent  of  morphological  dilation  nor  that  the  result  of 
the  operation  is  a  binary  image.  The  underlying  mathematical  ring  and  lattice 
structures  are  never  specified,  leaving  the  reader  to  ponder  the  action  of  the  binary 
operator  +.  If  in  a  Boolean  sense,  the  operator  represents  an  or  operation,  then 
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the  dilation  notion  is  reasonable.   The  dilation  required  by  BIA  is  formulated  by 
collectively  or-ing  a  sequence  of  translations  of  the  input  image  (translated  with 
respect  to  each  element  in  the  reference  image)  each  being   and-ed  with  the 
reference  image  (refer  to  Figure  4). 

The  image  algebra  by  Ritter  et  al.,  using  the  concepts  of  value  and  point 
sets,  readily  defines  morphological  dilation  in  this  instance.  Substituting  the 
global  reduce  operation  maximum,  V,  and  the  multiplicative  binary  associative 
operator,    • ,  in  the  generalized  matrix  product  form,  the  binary  morphological 
dilation  is  then  defined  as 

a  ©t  e  {(y,c(y)):  c(y)  =  \/  a^-t^x),  y  e  X},  (11) 

xei 

where  input  image  a  e  (2  )  ,  template  (BIA  reference  image)  t  6  ((2  )  )  ,  and 
the  point  set  X  =  1    *  I     defined  above.  Figure  6,  an  example  taken  from  [41], 
shows  the  dilation  operation  for  input  and  reference  images.  Without  further 
proof,  it  is  clear  that  for  binary  images  the  multiplicative  maximum  operation 
defined  in  [63]  is  equivalent  to  the  dilation  operation  defined  for  binary  image 
algebra. 

2.2.3  Evaluation 

Since  the  three  fundamental  operations  defined  by  the  binary  image  algebra 
have  equivalent  definitions  in  the  image  algebra  by  Ritter  et  al.,  and  the  sets 
which  define  any  input  and  reference  images  (templates)  are  also  equivalent,  BIA 
can  be  considered  to  be  a  subalgebra  of  the  image  algebra  in  [63];  that  is,  the 
image  algebra  [63]  can  perform  any  operation  combination  of  the  BIA.   This  has 
been  proven  rigorously  by  Davidson  [22]. 
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Figure  6.   Dilation  (Boolean)  expressed  in  the  language  of  BIA  and  image  algebra. 
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As  we  shall  show,  the  time  complexity  for  binary  only  operation  of  the 
optical  system  described  in  Section  3.5  is   0(1).   Since  it  can  perform  all  the 
operations  of  the  DOCIP  (those  defined  in  the  BIA)  which  has  operating  time 
complexity   0(logN)  [41],  the  binary  only  design  configuration  will  run  faster  for 
roughly  the  same  cost  function,  verifying  the  superiority  of  the  proposed  design 
over  the  DOCIP. 


CHAPTER  3 
CONFIGURATION  DESCRIPTION 


Optical  correlators  provide  not  only  a  solution  to  the  ATR  problem,  but  a 
method  for  target  identification  as  well.  The  complete  action  of  target 
identification  is  accomplished  by  cross -correlating  the  Fourier  transform  of  an 
image  scene  containing  a  target  with  a  transformed  image  of  the  target.  This 
methodology,  known  as  matched  filtering,  has  a  computational  time  complexity  of 
order   0(1)  which  is  necessary  for  high  response  rate  systems. 

The  correlator  design  described  in  this  chapter  is  an  integral  part  of  a 
proposed  autonomous  guidance  system  for  a  munition  directed  to  defeat 
high-valued,  relocatable  ground  targets  [19].  Although  the  correlator  will  be  the 
primary  means  of  automatically  recognizing  the  target,  a  large  portion  of  a 
system's  guidance  still  depends  upon  high  throughput  image  processing  as  shown 
in  Figure  7.  The  hybrid  electro-optical  design  described  in  this  dissertation  offers 
a  solution  to  this  image  processing  requirement.  It  uses  the  correlator  as  a  means 
of  rapidly  translating  the  input  image  over  an  array  of  fine  grained  SIMD  digital 
processors  that  allow  for  the  execution  of  the  arithmetic  and  logic  operations  of  the 
image  algebra  discussed  in  Chapter  2.   Guidance  algorithms  that  are  composed 
using  the  image  algebra  are  capable  of  satisfying  the  guidance  requirements  prior 
to  and  during  the  cross -correlation  portion  of  the  guidance  solution.  The  benefit 
here  is  a  combined  system,  not  two  separate  processing  activities.  This  equates  to 
lower  packaging  volume  and  weight;  these  requirements  are  critical  to  the  overall 
design  of  any  munition  guidance  system. 
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3.1  Correlator  Operation  for  ATR 

The  cross-correlation/matched  filter  operation  referred  to  in  this 
dissertation  is  accomplished  using  a  conceptual  correlator  design  based  on  a 
conventional  Fourier  optical  system  as  shown  in  Figure  8  [55].  The  input  image 
information  can  be  from  any  sensing  device  that  produces  a  digital  image,  e.g., 
passive  Forward-Looking-InfraRed  (FLIR),  Laser  Radar  (LADAR),  Synthetic 
Aperture  Radar  (SAR),  etc. 

3.1.1  Spatial  Light  Modulation 

The  most  critical  component  of  the  optical  correlator  described  in  this 
chapter  is  the  spatial  light  modulator  (SLM).   A  spatial  light  modulator  is  a 
two-dimensional  array  of  photonic  switches  that  modulate  the  intensities,  phase, 
or  polarizations  of  the  passing  light  beams.   The  conceptual  correlator  design 
discussed  in  this  dissertation  requires  a  SLM  that  can  provide  high  resolution 
(minimum  128x128  pixels),  0  —  255  pixel  gray  level  values,  and  a  high  switching 
frequency  (40KHz  is  used  as  a  baseline  for  this  evaluation).  Although  the 
state-of-the-art  does  not  currently  provide  all  the  requirements  in  a  single  SLM, 
research  shows  that  individually,  the  requirements  are  met  and  materials  are 
available  to  provide  the  combination  of  these  properties  in  a  single  SLM 
[4,8,20,25,53,65]. 

The  most  promising  type  of  SLM  that  can  be  expected  to  meet  the 
requirements  of  the  proposed  design  is  a  ferroelectric  liquid  crystal  (FLC)  SLM. 
The  liquid  crystal  material  is  chiral  smectic  A  [43].   In  a  smectic  liquid  crystal 
material,  the  molecules  are  arranged  parallel  to  each  other  so  that  diffraction 
patterns  can  only  be  obtained  in  a  single  direction.  In  the  chiral  smectic  A 
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material,  the  molecules  are  oriented  collinear  to  an  axis  that  is  orthogonal  to  the 
plane  containing  the  smectic  material  (see  Figure  9).   Application  of  an  electric 
field  causes  the  molecular  director,  the  null  direction,  to  tilt  in  linear  proportion  to 
the  strength  of  the  applied  electrical  field.  This  is  referred  to  as  an  electroclinic 
effect.  This  electroclinic  effect  or  switch  provides  gray  level  response  at  the 
switching  frequency  [8]. 

Power  dissipation  is  a  primary  limitation  of  the  switching  speed  of  most 
other  SLMs.  The  power  dissipation  for  FLCs  in  general  is  low;  in  fact,  it  is 
inversely  proportional  to  the  switching  speed  [43]. 

Smectic  A  devices  have  demonstrated  100ns  switching  speeds,  therefore, 
assuming  a  naive  line  multiplex  scheme,  128  frames  (one  line  of  a  128x128  image) 
x  100ns  equates  to  a  78KHz  frame  switching  frequency  [43].  As  this  technology 
matures,  submicrosecond  smectic  A  devices  have  the  potential  for  frame  switching 
frequencies  at  MHz  rates. 

3.1.2  Optical  System  Throughput 

The  incoming  digital  image  information  is  multiplexed  to  the  input  SLM 
arranged  first  in  the  correlator.  The  output  of  a  single  wavelength  laser  is 
collimated  and  polarized  to  produce  a  plane  wave  of  coherent  visible  light.  As  the 
polarized  plane  wave  passes  through  the  elements  of  the  SLM,  portions  of  the 
plane  wave  are  retarded  in  phase  as  shown  in  Figure  10.  The  amount  of 
retardation  corresponds  to  the  gray  level  value  of  the  digital  image  at  each 
respective  pixel  location.  The  emerging  wave  is  no  longer  perfectly  planar  but 
rather  a  phase  modulated  wave.  The  phase  modulated  wave  is  then  passed 
through  a  Fourier  transform  lens  (FTL)  where  the  power  spectrum  of  the  Fourier 
transform  of  the  image  information  is  formed  at  the  focal  plane  of  the  lens. 
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The  spontaneous  polarization  P  of  the  smectic  material  is  perpendicular  to  the 
perpendicular  to  the  y-z  plane  (tilt  plane).  The  application  of  an  electric  field 
produces  a  torque  through  a  first-order  interaction  with    P  such  that  the 
average  molecular  lattice  director   D  is  displaced  by  cf}.  As  the  plane  wave 
of  coherent  light  passes  through  the  crystal  lattice,  any  change  in  dp>  will 
produce  a  phase  change  in  the  plane  wave  (i.e.,  phase  modulation). 


Figure  9.  Ferroelectric  liquid  crystal  cellular  geometry. 
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A  second  SLM  is  aligned  with  the  focal  plane  of  the  FTL.  Fourier 
transformed  phase  information  for  the  complex  conjugate  of  the  reference  image 
(target)  is  placed  on  the  SLM,  and  the  emerging  wave  is  further  modulated  in 
phase.  The  effect  of  this  second  SLM  is  multiplicative  with  respect  to  the  energy 
passing  through  the  system.  Successful  cross -correlation  is  achieved  using 
phase-only  filters.  These  are  known  as  kinoforms,  i.e.,  not  relying  on  diffraction 
by  carrier  fringes  for  phase  modulation  [39].  When  used  as  a  matched  filter,  the 
filter  in  the  Fourier  plane  need  only  have  the  same  space-bandwidth  product  as 
the  input  image  [8]. 

The  cross-correlated  information  is  transformed  back  to  the  spatial  domain 
by  passing  through  a  second  FTL  located  at  the  same  focal  length  behind  the  filter 
which  performs  the  inverse  Fourier  transform.  The  entire  cross -correlation  process 
is  mathematically  notated  as 


Cfh  =  r'{S[f(x,y)].S*[h(x,y)]} 
=  rl{  F(u,v)  •  H*(u,v)  }, 

where  §  is  the  Fourier  transform  operation,  f(x,y)  represents  the  input  image, 
F(u,v)  is  the  Fourier  transform  of  f(x,y),  h(x,y)  represents  the  reference  or  target 
image,  H(u,v)  is  the  Fourier  transform  of  h(x,y),  and  *  is  the  complex  conjugate. 
Also,  note  that  the  autocorrelation,  C   ,  occurs  when  H  (u,v)  =  F  (u,v). 

The  cross-correlation  information  is  now  passed  through  an  analyzer, 
consisting  of  a  single  polarizer,  which  regulates  the  intensity  with  respect  to  the 
amount  of  phase  retardation.  Finally,  the  intensities  are  collected  on  a  charge 
coupled  device  (CCD)  array  where  they  are  converted  back  to  a  digital  form.  If 
there  is  a  match,  a  correlation  spike  is  formed  at  the  CCD  array.  Because  the 
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Fourier  transform  is  linear  shift  invariant,  the  correlation  spike  can  occur 
anywhere  in  the  input  image  where  there  is  a  match.   Once  the  target  is  detected, 
a  position  update  is  passed  to  the  guidance  and  control  unit.   According  to  the 
particular  guidance  law  being  used,  the  system  is  steered  in  such  a  manner  that 
the  correlation  spike  moves  to  either  the  center  position  for  zero  error,  pure 
pursuit;  or  to  an  offset  aim  point  for  lead  pursuit,  selective  aim  point. 

3.1.3  Composite  Matched  Filter 

Two  factors,  rotation  and  scale,  induce  variation  in  the  matched  filter, 
correlator  operation.  These  variations  are  due  to  the  geometrical  distortions 
inherent  in  viewing  a  dynamically  changing  3-D  scene.   For  this  correlator 
configuration,  these  variations  are  handled  using  a  technique  known  as  composite 
matched  filter  (CMF,  defined  by  Horner  and  Caulfield  [38]). 

One  method  of  dealing  with  the  above  variations  is  to  integrate  a  system  of 
different  focal  length  lenses  with  separate  filters  that  present  views  of  the  target 
over  a  finite  number  of  azimuths  and  elevations.   Although  this  system  would 
provide  satisfactory  results,  it  is  far  too  cumbersome  to  implement.   Due  to  the 
linearity  of  the  correlator  as  a  system,  the  same  results  can  be  obtained  by 
superimposing  a  sufficient  number  of  filters  into  one  composite  matched  filter. 
This  single  filter  is  constructed  as  a  sum  of  the  Fourier  transformed  reference 
images  and  is  notated  as 


CMF 

i=l 


=  £  VMF, 


where  k  is  the  total  number  of  matched  filters  (MF),  and  w  is  a  weighting 
function  for  each  filter  that  modifies  the  output  intensities  to  provide  a  constant 
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correlator  output  for  each  filter.  When  the  reference  images  are  combined  before 
being  transformed,  the  result  is  referred  to  as  a  synthetic  discriminant  function 
[38,44,46]. 

A  sufficiently  large  set  of  composite  matched  filters  may  now  define  an 
operational  volume  for  the  correlator  (refer  to  Figure  7).  The  set  of  composite 
matched  filters  is  continuously  cycled  through  the  respective  SLM  until  a 
correlation  occurs  or,  in  the  event  of  no  correlation,  a  decision  is  made  to  switch  to 
a  lower  level  of  image  processing. 

3.1.4  Post  Correlation  Processing 

Once  a  correlation  is  made,  a  correlation  spike  is  formed  on  the  CCD  array. 
The  location  of  the  spike  must  be  quickly  processed  to  determine  its  relative 
position  to  either  the  center  null  position  or  any  particular  offset  aim  point.  This 
operation  is  ideally  suited  for  the  design  configuration  presented  in  this  chapter. 
Using  the  characteristic  function  \  described  in  Section  2.1.3,  the  digital  values 
above  the  background  noise  are  set  to  unity  while  all  others  are  set  to  zero.  This 
allows  for  a  rapid  boolean  readout  of  the  location  of  all  Is  on  an  array  of  Os. 

A  more  elegant,  yet  highly  theoretical,  design  configuration  is  the  notion  of 
on  focal  plane  array  digital  computation.  The  fine  grained  computing  elements 
necessary  for  a  SIMD  design  are  formed  using  nanoelectronics,  an  integrated 
components  technology  developed  by  the  Central  Research  Laboratories  of  Texas 
Instruments,  Inc.  [28].  The  technology  is  based  on  heterostructure  fabrication 
methods  that  integrate  low  level  operations  (e.g.,  addition,  multiplication, 
comparison,  etc.)  with  the  output  of  each  element  of  the  CCD  array  located  at  the 
element  [28].  Although  the  hardware  complexity  at  this  level  is  very  low,  it  is 
sufficient  to  host  a  subset  of  image  algebra  operators  to  perform  a  wide  range  of 
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operations  (specifically,   x)-  The  benefit  here  is  that  the  steering  information  can 
be  interpreted  on  the  CCD  with  computational  time  complexity   0(1). 

3.2  The  Logical  Architecture 

The  logical  computer  architecture  described  in  this  section  is  specifically 
designed  for  the  image  processing  tasks  required  of  an  airborne  vehicle  guidance 
system  when  the  vehicle  is  outside  the  correlator's  operational  volume  (pre/post 
processing,  refer  to  Figure  7). 

The  proposed  design  is  a  hybrid  electro-optical  architecture  consisting  of 
three  tightly  coupled  components:  a  spatial  configuration  processor,  a  weighting 
processor,  and  an  accumulation  processor  as  shown  in  Figure  11.  These 
components  are  based  on  the  fact,  proven  in  Section  3.3.1,  that  a  general 
convolution  of  type  a  ©  t,  for  applicable  7  and  O,  is  equivalent  to 


f  (aet(i))Oc  , 

i  =  l  1 


where  n  is  the  number  of  elements  in  the  template,  a  ©  t(i)  is  simply  a  shift,  and 
c    is  the  scalar  value  or  weight  of  the  respective  template  element.  The  flow  of 

i 

data  and  image  processing  operations  are  directed  by  a  control  buffer  and  pipelined 
to  each  of  the  three  processing  components  as  shown  in  Figure  12.  Table  1 
presents  a  component  level  description  for  the  operations  required  of  each 
processing  element. 

The  spatial  configuration  processor  is  a  stepwise  optical  convolution  of  the 
original  input  image  with  each  template  location.  Figure  13  shows,  conceptually, 
an  optical  system  for  controlling  the  input  of  each  template  element.  A  unit  value 
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Figure  11.  Data  flow  diagram  for  implementing  a  ©  t. 
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Figure  12.  Pipelining  the  processors. 
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Table  1.   Component  design  requirements- 


Component 

Spatial  configuration 
processor 


Weighting  processor 


Accumulation  processor 


Control  buffer 


Design  Requirement 

A  Fourier  optical  correlator  design  using 
binary  and  high  resolution  (8-bits) 
SLM  technology  (chiral  smectic  A 
liquid  crystal  —  FLC)  and  a  charged 
couple  device  (CCD  array)  to  provide  the 
output  image  in  digital  form  —  frame 
switching  frequency  —  40  KHz. 

A  parallel  array  processor  with  directly 
connected  input  from  the  CCD  array  and 
the  capability  to  load  a  given  scalar  value 
to  each  arithmetic  logic  unit  (ALU).  A 
scalar  value  will  be  loaded  into  the 
template  register.   Each  ALU  must  have  a 
multiplier,  an  adder,  and  a  modified 
comparator  (Figure  15).  The  output  image 
will  be  the  result  of  combining  the  input 
image  with  the  scalar  using  one  of  the 
following  operations:   +;  •;  <;  =;  >; 
(and);  and  (or)  (Figure  16)  [61]. 

A  parallel  array  processor  with  directly 
connected  input  from  the  weighting 
processor.  Each  ALU  will  be  configured 
identical  to  those  in  the  weighting 
processor  with  the  addition  of  and  an 
accumulator  (see  Figure  16).  The 
accumulator  can  be  loaded  directly  by  the 
analog,  no  shifting  and  no  scalar  operation. 
The  accumulated  value  (video  out)  will  be 
the  result  of  combining  the  two  images 
using  one  of  the  seven  operations  described 
above. 

This  is  the  host  controller  (e.g.,  a 
Motorola  68040  CPU)  that  directs  all  I/O 
handling  and  data  flow.  A  translator  or 
compiler  must  be  designed  or  modified  to 
parse  the  specific  instructions  required  to 
interpret  the  algebraic  syntax  and  direct 
the  execution  of  the  other  components. 
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is  assigned  to  the  template  element  for  each  convolution.  Thus,  a  sine  function  is 
formed  and  is  inherently  multiplied  in  the  frequency  domain.  The  result  of  each 
convolution  is  simply  a  shift  of  the  input  image,  that  is,  each  picture  element, 
pixel,  is  shifted  by  the  displacement  of  the  respective  template  element  as  shown 
in  Figure  14.  If  the  operation  being  directed  by  the  control  buffer  is  a  pointwise 
image-to-image  operation,  then  the  input  image  is  sent  directly  to  the  array 
processor  controlling  the  accumulation  operation. 

The  output  of  the  spatial  configuration  processor  is  combined  pointwise 
using  the  appropriate  binary  associative  operator  from  the  algebraic  set  of 
operators  with  the  value  of  the  respective  template  element  as  shown  in  Figure  15. 
This  weighting  procedure  is  performed  on  a  digital  array  processor  using  the 
binary  associative  operators  assigned  to  each  arithmetic  logic  unit  shown  in  Figure 
16.  The  output  of  the  weighting  processor  is  now  accumulated  pointwise.   Again, 
this  procedure  is  performed  using  a  digital  array  processor.   Once  the  final 
template  element  is  processed  in  this  fashion,  the  accumulator  memory  contains 
the  result  of  the  generalized  matrix  product  for  the  appropriate  combine  and 
reduce  operations.  Recall  that  in  our  discussion,  the  generalized  matrix  product  is 
the  mathematical  formulation  for  operations  such  as:  general 
convolution/correlation,  additive  maximum/minimum,  and  multiplicative 
maximum/minimum. 

The  generalized  matrix  operations,  or  template-to-image  operations,  along 
with  the  direct  image-to-image  unary  and  binary  operations  allow  for  the 
formulation  of  all  common  image  processing  tasks.   Hence,  this  architecture  will 
control  the  execution  of  all  the  necessary  set  operations  of  the  underlying  image 
algebra. 
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3.3  Image  Algebra  Process  Distribution 

All  image-to-image  operations  are  performed  pointwise  in  one-to-one 
correspondence  with  respective  pixels.  Table  2  shows  each  operation  and  the 
corresponding  data  flow  or  process  distribution. 

Characteristic  functions,  as  defined  in  the  image  algebra,  apply  their 
respective  conditional  arguments  to  an  image  and  output  a  binary  image  consisting 
of  Is  where  the  condition  is  true  and  Os  elsewhere.  By  properly  tailoring  the 
Boolean  output  of  a  comparator's  circuit,  each  conditional  argument  can  be 
processed  as  shown  in  Figure  16.  Table  3  shows  the  algebraic  notation  and  the 
corresponding  data  flow  or  process  distribution. 

The  global  reduce  operations  S  ,  V  ,  A  ,  II  ,  and  the  boolean  or  transform 
an  image  to  a  scalar.   Since  the  proposed  architecture  does  not  consist  of  a  mesh  or 
any  interconnection  among  pixels,  these  operations  are  best  performed  by  a  host 
computer. 

The  image  algebra  basically  defines  templates  in  two  ways:  invariant, 
where  neither  the  template's  support  nor  its  values  change  with  respect  to  the 
target  point;  and  variant,  where  the  above  constraint  is  not  true  for  either  or  both 
sets.  The  intent  of  the  proposed  design  is  to  use  invariant  template  operations. 
The  reason  for  this  is  that  the  throughput  rate  is  maximized.  Variant  template 
operation  is  discussed  in  Section  3.6.   Table  4  shows  the  algebraic  notation  for 
image-to-template  operations  and  their  corresponding  data  flow  or  process 
distribution. 

The  image  algebra  operations  that  have  been  shown  to  distribute  over  the 
proposed  architecture  represent  a  substantial  increase  in  throughput  and  efficiency. 
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Table  2.   Image-to-image  operations- 


Image  Algebra  Process  Distribution 

k  +  a  parallel  step  1:  load  the  value  k  into  the 

template  register  and  image  a  into  the 
weighting  processor  ALU; 
parallel  step  2:  accumulate  9  ACCUM(x) 
=  k  +  a(x). 

k  •  a  parallel  step  1:  load  the  value  k  into  the 

template  register  and  image  a  into  the 
weighting  processor  ALU; 
parallel  step  2:  accumulate  9   ACCUM(x) 
=  k  •  a(x). 

a  +  b  parallel  step  1:  load  the  accumulator  with 

image  a  and  the  weighting  processor  ALU 
with  image  b  (no  shift); 
parallel  step  2:  accumulate  9  ACCUM(x) 
=  a(x)  +  b(x). 

a  —  b  parallel  step  1:  load  the  accumulator  with 

image  a  and  the  weighting  processor  ALU 
with  image  — b  (no  shift); 
parallel  step  2:  accumulate  9  ACCUM(x) 
=  a(x)  +  (-b(x)). 

a  •  b  parallel  step  1:  load  the  accumulator  with 

image  a  and  the  weighting  processor  ALU 
with  image  b  (no  shift); 
parallel  step  2:  accumulate  9  ACCUM(x) 
=  a(x)  •  b(x). 

a  /  b  parallel  step  1:  load  the  accumulator  with 

image  a  and  the  weighting  processor  ALU 

with  image  b_1(no  shift); 

parallel  step  2:  accumulate  9  ACCUM(x) 

=  a(x)  •  b-^x),  where  ACCUM(x)  =  0 

when  b_1(x)  =  0. 
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Table  2  -  continued 

Image  Algebra  Process  Distribution 

a  and  b  parallel  step  1:  load  the  accumulator  with 

binary  image  a  and  the  weighting 
processor  ALU  with  binary  image  b 
(no  shift); 

parallel  step  2:  use  the  boolean  output  of 
each  comparator  3  ACCUM(x)  = 
a(x)  andb(x). 

a  or  b  parallel  step  1:  load  the  accumulator  with 

binary  image  a  and  the  weighting 
processor  ALU  with  binary  image  b 
(no  shift); 

parallel  step  2:  use  the  boolean  output  of 
each  comparator  9  ACCUM(x)  = 
a(x)  orb(x). 

avb 

parallel  step  1:  load  the  accumulator  with 
image  a  and  the  weighting  processor  ALU 
with  image  b  (no  shift); 
parallel  step  2:  accumulate  using  the 
comparator  9  if  fa(x)  >  b(x)l,  ACCUM(x) 
=  a(x),  else  ACCUM(x)  =  b(x). 

a  A  b  parallel  step  1:  load  the  accumulator  with 

image  a  and  the  weighting  processor  ALU 
with  image  b  (no  shift); 
parallel  step  2:  accumulate  using  the 
comparator  3  if  [a(x)  <  b(x)l,  ACCUM(x) 
=  afx),  else  ACCUM(x)  =  b(x). 

a"1  t 

ac  hardware  control  see  Figure  16. 
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Table  2  -  continued 

Image  Algebra  Process  Distribution 

—a  parallel  step  1:  load  the  value  —1  into  the 

template  register  and  image  a  into  the 
accumulator; 

parallel  step  2:  accumulate  3  ACCUM(x) 
=  "I  •  ■«. 

Va"  t 

|  a  |  load  the  accumulator  with  image  a,  and 

mask  the  appropriate  sign  control  bit  for 
the  accumulated  value. 


log  b 

a 

i 

ab 

i 

ea 

i 

cos  a,  sin  a,  tan  a 

t 

cos_1a,  sin_1a,  tan_1a 

t 

t  This  operation  may  be  processed  more  efficiently  using  a  high-speed  sequential 
processor/math  accelerator  provided  by  the  control  buffer.  If  the  hardware 
complexity  of  the  physical  implementation  is  robust,  then  many  of  these  higher 
order  operations  can  be  included  in  the  parallel  computation. 


56 


Table  3.   Characteristic  functions- 


Image  Algebra  Process  Distribution 

X>  (a)  parallel  step  1:  load  the  accumulator  with 

image  a  and  the  weighting  processor  ALU 
with  image  b  (no  shift,  if  instead  of  image 
b  a  scalar  value  m  is  used,  simply  load 
template  register  with  m); 
parallel  step  2:  accumulate  the  boolean 
output  of  each  comparator  for  the  > 
condition,  w.r.t.  b  (or  m). 

X^  (a)  parallel  step  1:  load  the  accumulator  with 

^b 

image  a  and  the  weighting  processor  ALU 
with  image  b  (no  shift,  if  instead  of  image 
b  a  scalar  value  m  is  used,  simply  load 
template  register  with  m); 
parallel  step  2:  accumulate  the  boolean 
output  of  each  comparator  for  the  < 
condition,  w.r.t.  b  (or  m). 

Xy  (a)  accumulate  3  ACCUM  =  [*<  (a)]c. 

X<  (a)  accumulate  3  ACCUM  =  [x>  (a)]c. 

X_  (a)  parallel  step  1:  load  the  accumulator  with 

— b 

image  a  and  the  weighting  processor  ALU 

with  image  b  (no  shift,  if  instead  of  image 

b  a  scalar  value  m  is  used,  simply  load 

template  register  with  m); 

parallel  step  2:  accumulate  the  boolean 

output  of  each  comparator  for  the  = 

condition. 

X±  (a)  accumulate  3  ACCUM  =  [x_  (a)]c. 

X\     i(a)  parallel  step  1:  accumulate  image  b  = 

lm>n| 

Xs  (a)  and  store  in  image  plane  register  B 

Cm 

parallel  step  2:  accumulate  image  3 
ACCUM  =  X/  (a)  and  load  the  weighting 

in 

processor  ALU  from  B  (b,  no  shift); 
parallel  step  3:  accumulate  3  ACCUM  = 
b  •  ACCUM 
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Table  4.  Image-to-template  operations. 


Image  Algebra  Process  Distribution 

a©t  parallel  step  1:  load  image  a  into  the 

analog  processor; 

parallel  step  2:  load  the  spatial  location  of 
template  t  into  the  analog  processor  and 
the  value  of  template  t  into  the  template 
register  of  the  weighting  processor; 
parallel  step  3:  combine  the  output  of  the 
analog  processor  with  the  contents  of  the 
template  register  using  the   •   operator; 
parallel  step  4:  accumulate  the  output  of 
the  weighting  processor  using  the  + 
operator  3  ACCUM  =  ACCUM  + 
(aet(i)); 
parallel  step  5:  repeat  steps  2  through  4 

V  i  G  the  template  configuration  (pipeline). 

a  ©  t  same  as  a  ©  t,  except  the  V  operation 

replaces  the  +  operation  in  step  4. 

a  ®  t  same  as  a  ©  t,  except  the  A  operation 

replaces  the  +  operation  in  step  4. 

ant  same  as  a  ©  t,  except  the  +  operation 

replaces  the  •   operation  in  step  3,  and  the 

V  operation  replaces  the  +  operation  in 
step  4 

a  □  t  same  as  a  ©  t,  except  the  +  operation 

replaces  the  •   operation  in  step  3,  and  the 
A  operation  replaces  the  +  operation  in 
step  4 
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They  also  represent  a  very  rich  subset  of  the  image  algebra,  powerful  enough  to 
accomplish  all  common  image-to-image  transforms  [58,59]. 

3.3.1  Proof  of  Concept 

Recall  the  following  image  algebra  definitions:  the  global  reduce  function; 
Equation  2, 


T  a  =    T   a(x)  =  a(xt)  7  a(x2)  7  ■  •  •  7  aCxJ, 
xei 


the  generalized  vector -matrix  or  image-template  product;  Equation  3, 


a©t  = 


(7,  b(y)):  b(y)  =  T    a(x)  Oty(x),  y  G  Y 
xei 


the  general  convolution;  Equation  4, 


aet  = 


(7,  b(y)):  b(y)  =  £  a(x)  •  ty(x),  y  G  Y 


xGI 


where  X  and  Y  are  coordinate  sets,  a,  b  G  F  ,  F  is  an  arbitrary  value  set,   7 

Y  Y 

and  O  are  binary  associative  operators  on  F,  and  template  t  G  (F  ) . 

In  support  for  the  theorem  that  follows,  suppose  F  e  {  R,  C,  1  k  },   Y  G  Zf2, 

Y  Y 

and  t  G  (F  )     denotes  a  translation  invariant  template  with  card{  o^(t  ))  =  n. 
For  some  arbitrary  point  y  G  X,  let   {  x  (y),  xn(y),-  •  • ,  x_(y)  }  =  <2>"(ty) 
and  for  i  =  1,  2,-  •  -,n,  set  q  =  t  (x^y)). 


1 —    2""       '    iT"  v  y 
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For  parameters  {1,  2,«  •  ',11},  define  an  F-valued  parameterized  template 
by 

t(i)y(x)  = 

[  0,  otherwise 

Note  that    <^(t(i)y)  =  {  x^y)  }  for  i  =  1,  2,-  ■  •,  n. 

n 

Theorem  1:  aet  =  V  (a©t(i))  •  q.  (12) 

i  =  1 

Proof:  In  order  to  simplify  notation  we  let  ^  =  a  ©  t(i)  and  show  that 

n 


n 

aet 


i=  1 


First  note  that  aj(y)  =    V  a(x)  •  t(i)y(x) 


xGI 


=     Y     a(x)  •  t(iyx) 


xe<^(t(i)y) 


=  £  a(x)  •  t(i)y(x) 
xe{xj 


=  a(xi)  •  t(i)y(xi) 
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Thus,  at  is  simply  a  shift  of  image  a,  with  a(y)  being  replaced  by  a(xL). 

n 

Let  b  =  a  ©  t  and  d  =  \    a;  •  Cj.  In  order  to  prove  the  theorem,  we 

i  =  l 

need  to  show  that  b(y)  =  d(y),  V  y  e  X. 

By  definition,  b(y)  =  V  a(x)  •  ty(x) 

xei 

=      Y     a(x).ty(x) 


=         £  a(x)  -  ty(x) 
xe{x1,  x2,- •  •,  Xjj} 


11 

-  £  a(xj)  •  yxj 


i=l 


11 


i=l 


it 

=  £  *i(y)  •  q 


i  =  l 


=  d(y) 

Q.E.D. 
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As  an  easy  consequence  of  this  theorem,  the  following  corollary  can  be 
obtained. 

Corollary  I: 

(1)  If  ©6  {  M,  D  }  and  F  denotes  the  appropriate  extended  real 
value  set  for  the  operation  ©,  then 

a©t=   f    (a©t(i))  +  Cj  (13) 

i=l 

(2)  If  ©  e  {  ©,  ®  }  and  F  denotes  the  appropriate  extended  real 
value  set  for  the  operation  ©,  then 

a©t  =   f    (a©t(i))  -  ct  (14) 

i=l 

Proof:  The  proof  is  identical  to  the  proof  of  Theorem  1  with  the 
appropriate  support  at  infinity  replacing    e/(t_) 

Q.E.D. 

Theorem  1  and  Corollary  I  show  that  the  general  convolution  operator  © 
can  be  implemented  by  using  simple  shifts,  global  pixel-wise  multiplication  (or 
addition)  using  the  scalar  q,  and  accumulating  the  results.   Figures  17  and  18  in 
Section  3.4  illustrate  this  observation.  We  conclude  this  section  with  another 
corollary  of  Theorem  1. 

Corollary  II:  If  ©  is  one  of  the  operations  of  Theorem  1  or  Corollary  I,  and 
y  =  X:  is  an  element  of  the  support  of  t  ,  then 


a©t  =  (aOCj)  7 


T  (aet^OCi 


(15) 
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Proof:   Since  t(j)  is  the  identity  template,  a:  =  a$ t(j)  =  a.  Therefore, 
a  ©  t(j)  O  Cj  =  a  O  Cj.  The  result  now  follows  from  the  commutativity  of  7. 

Q.E.D. 
Corollary  II  is  important  in  that  it  saves  one  convolution  if  the  condition  y  =  Xj 
is  satisfied. 

3.3.2  Proof  of  Phase  Only  Filter  Application 

Phase  only  filter  application  is  of  special  interest  when  spatial  light 
modulators  are  used  to  provide  matched  filtering  in  optical  correlation.  A  SLM 
can  only  provide  amplitude  or  phase  information  unlike  a  holographic  film  plate 
which  can  provide  both  simultaneously  [39]. 

Aside  from  the  correlation  operation,  the  main  objective  of  the  optical 
processor  described  in  this  dissertation  is  to  simply  shift  the  image.  We  must 
show  that  the  phase  only  operation  of  the  SLM  will  be  sufficient.  In  order  to  prove 
sufficiency,  let 

{x1(y),x2(y),...,xn(y)}=  <^(ty). 

For  each  k  =  1,  2,  •  •  • ,  n,  there  exists  a  unique  vector 

z(k)  =  (jy(k),  *2(k))  6  P, 

such  that  xk(y)  +  z(k)  =  y.   Since  t  is  translation  invariant,  then  for  any 
arbitrary  vector  p  e  I2,  xk(y  +  p)  =  xk(y)  +  p,  where 

{x^y+p),  x2(y+p),.  •  -,  xn(y+p)}  =  <ty+p). 
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Thus, 


*k(y+p)  +  <±)  =  My)  +  *(k))  +  p  =  y  +  p. 


(16) 


Equation  16  simply  says  that  z(k)  is  not  affected  by  translations. 

For  each  k  =  1,  2,  •  •  • ,  n,  consider  the  unit  impulse  response  function 


*M  = 


1  if  x  =  z(k) 
0  if  x  i  z(k) 


Then  8^  can  be  used  to  provide  another  representation  of  the  one-point  template 
t(k)  by  observing  that   ^(y-x)  =  t(k)y(x),  V  y,  x  e  P.  This  follows  from  the 
fact  that 


*k(y-x)  = 


lifx  =  z(k) 
0  if  x  +  z(k) 


and  the  observation  that  y  —  x  =  z(k)  <=>  x  =  xk(y). 

Next,  recall  from  the  proof  of  Theorem  1  that  if  ak  =  a  ©t(k),  then  ak(y) 
=  a(xk(y)).  However,  we  also  have  that 


a(y)  ©  Sk(y)  =  ^  a(x)  •  ^(y-x)  =  a(xk(y)), 


(17) 


xGI 


where  ®  denotes  the  convolution  product.  Therefore, 


a©^  =  a®t(k) 


(18) 
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or,  equivalently, 

ak  =  a®^.  (19) 

In  the  frequency  domain  we  have,  according  to  the  convolution  theorem  [36],  the 
equivalent  formulation 

S{ak}  =  S{a®«k}  =  S{a}.S{^k},  (20) 

where  §  denotes  the  Fourier  transform  operation. 

Employing  the  notation   £k  =  S{^},  we  have  that 

MM  =  g*,  ^  £  Uu)  e-2"(W«  +  Vl")  (21) 

_    -27ri(2;1(k)u/M  +  z2(k)v/N) 

for  all  (u,i;)  e  I2.  Thus,   £k  has  constant  magnitude   1 2>k(ti,-y)  |  =1  and  phase 
<j>{u,v)  =  e-27ri(zi(k)u/M  +  *2(kWN);  i.e.,  only  phase  is  a  factor  in  the 
multiplication  of  §{a}  in  Equation  20. 
Since 

Hd)  =  a(xk(y))  =  <tMV),  (22) 

it  also  follows  from  Equations  19,  20,  and  21  that 
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a(y-z(k))  m  i(u,v)  e-MtiW"/*  +  *00*/i)f  (23) 

where  the  double  arrow  is  used  to  indicate  the  correspondence  between  a  function 
and  its  Fourier  transform.   Equation  23  expresses  the  well-known  translation 
property  of  the  Fourier  transform  [36].  Therefore,  multiplying  the  Fourier 
transform  of  the  image  a  by  the  exponential  equivalent  of  />k  and  then  taking 
the  inverse  Fourier  transform,  shifts  the  image  by  the  vector  z(k),  that  is,  it 
moves  the  origin  of  the  image  to  (,z1(k),22(k)).  Equation  23  also  expresses  the  fact 
that  a  shift  of  the  image  does  not  affect  the  magnitude  of  its  Fourier  transform. 
As  a  final  observation,  note  that  Equations  18,  22,  and  23  imply  that 

a  8  t(k)  <=»  a  •  Sk  ; 

i.e.,  shifting  the  image  in  the  spatial  domain  with  the  use  of  the  image  algebra 
operation  a  ©  t(k)  is  equivalent  to  the  product  a  •  8^  in  the  frequency  domain. 

3.4  Performance  Evaluation 

3.4.1  Example  Operation 

Figures  17  and  18  illustrate  the  processing  steps  involved  in  performing  a 
general  convolution  and  an  additive  maximum,  using  a  Von  Neumann  type 
template,  respectively.  Notice  however,  that  by  proper  selection  of  operation 
pairing  (i.e.;  •  with  A,  and  +  with  A)  multiplicative  minimum  and  additive 
minimum  are  produced  in  the  same  fashion. 
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GIVEN: 

INPUT  IMAGE 


TEMPLATE 
t 

2 
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a©t 
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Figure  17.  Example  of  general  convolution. 
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GIVEN: 

INPUT  IMAGE 


0  0  0  0  Oj 
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Figure  18.   Example  of  additive  maximum. 
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3.4.2  Comparison  of  Execution  with  RISC  Performance 

There  are  currently  two  limiting  factors  concerning  the  proposed 
architectural  design.  The  major  limitation  is  the  switching  speed  of  the  spatial 
light  modulators.  The  best  SLMs  in  the  research  community  today  will  operate  at 
a  10-40  KHz  switching  frequency  with  a  potential  for  MHz  [5].   This  is  well  below 
the  expected  operating  speed  of  the  remaining  digital  components,  a  common  25 
MHz  clock  frequency.  The  other  limitation  is  the  memory  I/O  speed  which  will  be 
necessary  to  load  in  parallel  the  SLMs  and  the  digital  array  processors.   High  speed 
circuits  such  as  cross-bar  switching,  block  memory  transfer,  and  column  parallel 
loading  networks  do  exist  for  this  application,  however,  the  physical  connections 
become  a  problem  as  the  number  of  elements  in  the  SLM  increases.   Faster  optical 
addressing  techniques  which  tend  to  reduce  this  limitation  are  being  developed 
[20].  The  40  KHz  switching  frequency  will  be  the  anticipated  limiting  factor  for 
the  rest  of  the  discussion.  It  will  determine  the  time  spent  in  each  section  of  the 
pipeline  (refer  to  Figure  12). 

Since  the  proposed  design  is  dependent  upon  the  size  of  the  template 
configuration  and  independent  of  image  size,  the  following  equation  can  be  used  to 
estimate  the  computation  time  for  a  convolution  type  operation: 

time  =  (n  +  J)  *  cI5ck"  >  (24) 

where  n  is  the  number  of  points  in  the  template  support,  and  clock  corresponds 
to  40  KHz.  The  addition  is  the  last  stage  of  the  pipeline,  the  beginning  of  the 
pipeline  bypasses  the  analog  processor  as  noted  at  the  end  of  Section  3.3.1.  Table 
5  show  the  results  for  5  and  25  element  templates. 
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The  following  equation  can  be  used  to  estimate  the  computation  time  for  a 
naive  calculation  of  a  template  convolution  over  an  N*N  image  using  a  Reduced 
Instruction  Set  sequential  Computer  (RISC): 

where  n  is  the  number  of  points  in  the  template  support  (n  multiplications  plus 
n— 1  additions),  the  2  in  the  denominator  represents  memory  interleaving,  clock  is 
the  operating  frequency  (25  MHz),  and  1  clock  cycle  per  operation.  Table  6  shows 
the  results  for  5  and  25  elements  over  various  images  sizes. 

3.4.3  Algorithm  Description  and  Evaluation 

The  max-min  sharpening  transform  is  selected  as  a  representative 
algorithm.  Such  an  algorithm  is  typically  used  as  a  preprocessing  filter  for  the 
subsequent  correlation  action.  The  transform  is  an  image  enhancement  technique 
that  brings  fuzzy  gray  level  objects  into  sharp  focus.  This  is  an  iterative  technique 
that  compares  the  maximum  and  minimum  gray  level  values  of  pixels  contained  in 
a  small  neighborhood  about  each  pixel  in  the  image.  The  results  of  the  comparison 
establishes  the  respective  pixel  output.  The  benefit  of  its  use  as  a  preprocessing 
step  is  that  a  sharper  image,  when  presented  to  the  correlator,  results  in  a  higher 
signal  to  noise  ratio  on  the  correlation  plane.  Thus,  such  an  image  has  a  higher 
probability  of  identification. 

3.4.3.1  Description.  The  image  algebra  formulation  is  defined  as  follows: 
Let  a  G  R    be  the  input  image,  and  iV(x)  a  predetermined 
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Table  5.   Estimated  execution  time  for  proposed  design- 
Number  of  elements  Time(seconds) 

5  1.5  x  lO"04 

25  6.5  x  10"04 


Table  6.   Estimated  execution  time  for  RISC  design- 
Number  of  elements  Time(sec)  per  image  size 


25 


7.4  x  10"4 

64x64 

0.003 

128x128 

0.012 

256x256 

0.047 

512x512 

0.189 

1024x1024 

0.004 

64x64 

0.016 

128x128 

0.064 

256x256 

0.257 

512x512 

1.028 

1024x1024 
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neighborhood  function  of  iEl;  and  t :  P  — »  R .    be  defined  by 


t*M  = 


0       if    x   €  JV(x) 
-co    otherwise 


The  output  image  s  is  the  result  of  the  max-min  sharpening  transform 
given  by  the  following  algorithm: 


aM:=aflt 


a_  :=  aQ  t 

m 


b  :=  a,,  +  a^-2  •  a 

M       m 


s:=X<0(b)-aM+x>0(b)-am) 

* 
where  t  is  the  additive  dual  of  t.  For  this  specific  implementation,  both  the 

input  and  the  output  images  are  defined  on  the  point  set  X  which  corresponds  to 

the  coordinate  points  of  the  CCD  array.  Therefore,  points  which  lie  outside  X 

are  of  no  computational  significance.  Also,  the  support  of  t    and  t  are 

equivalent;  i.e.,  ty(x)  =  ty(x),  V  x  e  e^(ty).  The  algorithm  is  now  iterated  either 

a  predetermined  fixed  number  of  times  or  until  s  stabilizes,  that  is,  until  there  is 

no  change  in  gray  value  from  one  iteration  to  the  next. 

Figure  19  depicts  a  typical  input  stream  for  the  instruction  to  execute  the 

above  algorithm.  An  instruction  stack  in  prefix  notation  is  used  to  illustrate  the 

instruction  stream.  This  is  a  simplification  of  what  would  normally  take  place  in  a 
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NOTE:  Image  Planes  are  formed,  point  wise,  using  the 
local  memories  at  each  accumulation  processor 
(refer  to  Figure  15). 
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Figure  19.   Example  of  instruction  stack  reordering. 
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series  of  translations  from  the  high  level  algebraic  abstraction  to  the  machine 
instruction  level. 

After  an  initial  scan  of  the  stack,  the  stack  operations  can  be  separated  into 
general  matrix  production  type  operations  and  those  that  are  strictly  pointwise. 
By  sequentially  ordering  these  classes  of  operations  (general  matrix  operations 
first)  and  further  noting  the  number  of  repetitive  operation  substrings,  the  stack 
can  be  reconfigured  (refer  to  Figure  19)  to  execute  with  a  minimal  number  of  clock 
cycles  per  operation. 

3.4.3.2  Evaluation.   Let  the  switching  frequency  of  the  SLMs  in  the 
correlator  operate  at  40KHz,  use  10  clock  cycles  as  an  average  for  each  digital 
operation  in  both  the  weighting  and  accumulation  processors,  and  use  a  common 
25MHz  clock  frequency.   Then,  assuming  the   card(t)  =  5  and  using  Equation  24, 
the  time  to  compute  the  two  operations  0  and  0  is  0.0003  seconds.  The 
remaining  eight  pointwise  operations  are  executed  in  3.36xl0"06  seconds.  The 
total  time  is  now  0.00030336  seconds.  Note,  if  the  assumption  of  10  clock  cycles 
per  operation  is  raised  to  100,  the  total  time  of  execution  becomes  0.0003336 
seconds.  This  is  a  result  of  the  40KHz  switching  frequency  which  is  much  lower 
than  the  common  clock. 

A  comparison  with  a  sequential  RISC  type  machine  is  made  in  a  similar 
manner  to  the  comparison  in  the  previous  section.   An  arbitrary  image  size  of 
256x256  pixels  is  used  to  facilitate  the  computation.   Assuming  once  again  a  25 
MHz  common  clock  and  one  operation  per  clock  cycle,  Equation  25  is  used  to 
evaluate  the  timing  for  the  two  general  matrix  operations.  The  total  time  for  both 
operations  is  0.0236  seconds.  The  remaining  operations  are  evaluated  by 
modifying  Equation  25,  where  the  2n— 1  parameter  is  replaced  by  the  number  of 
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operations,  in  this  case  8.  The  time  for  these  eight  operations  (0.0105  seconds) 
plus  the  general  matrix  operations  equals  a  total  execution  time  of  0.0341  seconds. 

3.4.3.3  Conclusions.   The  gain  in  performance,  or  the  speedup,  is  measured 
as  the  ratio  of  the  best  sequential  execution  time  to  the  parallel  execution  time. 
For  this  comparison  the  configuration  design  estimate  to  the  RISC  estimate  is 
approximately  112  to  1. 

A  more  important  measure  is  based  on  the  requirement  of  the  sensor.  For 
most  applications  the  minimum  frame  rate  for  a  system  to  update  a  guidance  law 
is  30  Hz;  other  higher  performance  systems  are  at  a  100  Hz  rate.   Considering 
the  simplicity  of  this  algorithm  and  the  fact  that  it  represents  only  a  small  portion 
of  what  must  be  computed  within  a  required  frame  rate.   The  comparison  just 
given  makes  an  important  point;  the  algorithm  cannot  be  executed  once  within  the 
minimum  frame  rate  of  30  Hz  using  sequential  RISC  technology.   On  the  other 
hand,  the  algorithm  can  be  executed  on  the  proposed  design  approximately  98 
times  per  frame.   Clearly  there  exists  a  substantial  increase  in  throughput  for  the 
proposed  design.   The  proposed  design  just  breaks  the  speed  barrier  necessary  to 
make  it  functional  in  an  ATR  system.    The  interesting  speculation  is  how  the 
throughput  would  increase  with  better  performance  from  the  SLMs. 

3.5  Binary  Image  Operation 

The  hybrid  electro-optical  system  described  thus  far  is  also  usable  as  a 
binary  image  correlator/processor.  The  primary  reason  for  doing  so  is  to  achieve  a 
significant  increase  in  the  processor  throughput.  The  entire  template  configuration 
is  convolved  using  the  optical  process,  thus  eliminating  any  iterative  solution.  The 
computational  time  complexity  in  this  case  is   0(1).  The  disadvantage  is  if  the 
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input  image  information  is  restricted  to  binary  images,  then  more  sophisticated 
algorithms  that  turn  binary  images  into  gray  scale  images  cannot  be  utilized 
efficiently.  With  regard  to  the  ATR  problem  for  specific  applications  that  require 
very  high  speed  image  processing,  the  advantage  may  outweigh  the  disadvantage. 

3.5.1  Functional  Description 

The  physical  layout  of  the  correlator  may  remain  unchanged  from  the 
description  in  Section  3.1  depending  upon  how  the  images  obtained  by  the  sensor 
are  used.  The  only  change  to  the  correlator  is  a  replacement  of  the  first  8-bit, 
SLM  with  a  two  state  or  binary  SLM.  In  this  case  the  sensing  device  establishes 
the  necessary  threshold  to  provide  the  binary  input  image.   Otherwise,  an  initial 
gray  level  image  is  processed  by  the  proposed  design  allowing  the  X-^n  operation 
to  provide  the  binary  image  which,  in  turn,  is  fed  back  to  the  first  SLM.  Again, 
depending  on  the  requirement,  replacing  the  SLM  will  provide  a  higher 
throughput. 

With  the  assumption  that  a  binary  image  is  passed  as  the  input  image,  and 
the  weights  in  any  arbitrary  template  configuration  are  all  set  to  1,  then  the  phase 
information  for  the  complex  conjugate  of  the  entire  template  configuration  is 
placed  on  the  second  SLM  and  processed  (refer  to  Figure  8).  The  intensity  of  the 
energy  emerging  from  the  analyzer  contains  the  results  of  the  convolution.  At  this 
time  the  convolved  information  is  transformed  by  the  CCD  array  into  a  gray  level 
image.  The  maximum  gray  level  value  is  directly  proportional  to  the  number  of 
times   1  values  overlapped  in  the  convolution  process,  and  it  is  bounded  by  the 
total  number  of  non  zero  elements  in  the  template.  Figure  20  depicts  this 
operation. 
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3.5.2  Image  Algebra  Process  Distribution 

The  simplicity  of  operation  of  this  design  configuration  stems  from  the  way 
the  morphological  operations  are  derived.  In  [63]  Ritter  et  al.  shows  that  a 
dilation  of  a  binary  image  a  with  a  structuring  element  b  is  equivalent  to  a©t, 
where  b  =  c/(ty)  and  ty(x)  =  1  for  all  x  6  e^(t_).  Similarly,  a  binary  erosion  is 
equivalent  to  a  ©  t.  By  using  the  convolution  operator  ©,  Davidson  [22]  proves 
that  the  morphological  operations  of  dilation  and  erosion  for  a  binary  image,  a,  are 
derived  by  the  following  expressions: 

dilation  >U 

and 

a      .     =X_n(aet),  (27) 

erosion  — '* 

where  n  =  the  number  of  elements  in    e/(t  )  and  a,  a  ,  and  a  are 

*  dilation  erosion 

Boolean  images. 

The  Boolean  operations   or  and  and  are  used  interchangeably  with  the 
set -theoretic  operations  union  and  intersection.  The  remaining  set -theoretic 
operation  necessary  for  the  binary  image  algebra  defined  by  Huang  et  al.  is  the 
complement  operation.   As  notated  in  Table  2,  this  operation  is  easily 
accomplished  in  hardware  using  a  flip-flop  at  the  appropriate  circuit  location,  as 
shown  in  Figure  14. 

The  operations  just  described  are  only  valid  when  the  weights  of  the 
template  are  all  set  to  unity.  If  template  weights  other  than  1  are  required,  then 
the  convolution  operations  can  be  accomplished  in  the  iterative  manner  as 
described  in  Section  3.3.  The  use  of  a  census  template  is  an  example  of  this 
situation.  Here  the  weights  of  the  template  are  distinct  powers  of  a  prime  number. 


GIVEN: 
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Figure  20.  Binary  image  erosion  and  dilation. 
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When  the  template  is  convolved  with  a  binary  image  the  result  is  a  unique  number 
for  each  distinct  pattern  of  Os  and  Is  [22].  This  is  a  fundamental  step  in  most 
algorithms  used  for  labelling  objects  in  binary  images. 

The  conclusion  to  draw  here  is  that  by  allowing  the  manipulation  of  gray 
level  values  formed  by  the  convolution  of  binary  images  or  binary  image  and  gray 
level  template,  Huang's  binary  image  algebra  described  in  Chapter  2  is  extended 
[41,67].  The  extension  is  for  both  the  set  of  values  and  the  number  of  operations 
over  the  set;  the  set  {1,0}  is  replaced  by  ffcK,  and  the  characteristic  operation  x 
and  general  convolution  operation  ©  are  included  to  provide  erosion  and  dilation 
operations  more  directly. 

3.6  Variant  Template  Operations 

Referring  to  Section  2.1.4,  the  definition  of  a  variant  template  is  one  whose 
configuration  and/or  values  are  allowed  to  vary  with  respect  to  the  reference  point 
location.  The  Fourier  and  Gabor  templates  derived  from  the  kernels  of  the 
discrete  Fourier  and  Gabor  transforms  are  examples  of  variant  templates.  The 
implementation  of  algorithms  involving  variant  template  operations  using  the 
conceptual  system  described  in  Section  3.2  can  result  in  significant  reductions  in 
system  throughput  for  some  specific  variant  templates. 

The  generalized  matrix  product  relative  to  parallel  architecture 
communication,  as  shown  by  Wilson  [74],  unifies  the  three  concepts  of 
many-to-one,  one-to-many,  and  one-to-one  data  transfers;  i.e.,  reduction, 
replication,  and  permutation  respectively.   The  first  two  concepts  are  used  to 
explain  the  throughput  differential. 
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The  image-to-template  operations  of  the  proposed  system  as  defined  in 
Table  4,  are  intended  for  invariant  template  operations,  and  can  be  interpreted  as 
the  generalized  left  product  of  image  a  with  template  t.  This  is  defined  for  all  a, 
b  6  IF     and  t  6  (F  )     such  that  for  any  reference  point  y  6  X 


b(y)=  r  (  a(x)  Ot^y)  ). 


In  short,  the  image  is  shifted  with  respect  to  the  location  of  each  template  element 
in  the  support  of  t;  such  that,  for  each  shift,  each  image  pixel  value  is  combined 
under  the  O  operation  with  the  corresponding  template  value.  This,  according  to 
Wilson,  is  a  one-to-many  data  transfer,  or  replication,  operation  [74].   As  noted 
earlier,  the  time  complexity  is   0(N)  where  N  =  cara\i);  i.e.,  card[a)  parallel 
operations  repeated  N  times. 

The  definition  of  the  variant  template  is  dependent  on  the  reference  point's 
location  relative  to  each  point  in  the  image,  therefore,  for  the  proposed  system 
only  the  generalized  right  product  is  interpreted.   That  is,  for  each  reference  point 

yex 


b(y)=  T(a(x)Oty(x)). 
xei 


In  this  case  the  template  is  shifted  with  respect  to  the  location  of  each  pixel  in  the 
image  a  such  that,  for  each  shift,  each  variant  of  the  template  description  is 
combined  under  the  O  operation  with  the  corresponding  pixel  values.  In  other 
words,  a  variant  template  is  loaded  as  an  image  and  the  image  pixel  value 
corresponding  to  this  particular  template  is  loaded  in  the  template  register  (refer 
to  Figure  13).  This,  according  to  Wilson,  is  a  many-to-one  data  transfer,  or 
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reduction,  operation  [74].  The  time  complexity  is  0(N),  where  in  this  case 
however,  N  =  card(a);  i.e.,  card{  e^(t  ))  parallel  operations  repeated  N  times 
where  card{  e^(ty))  varies  as  y  varies.  This  time  difference  is  significant,  two  or 
three  orders  of  magnitude  in  most  applications. 

There  are  methods  that  can  be  applied  to  the  variant  templates  themselves 
that  tend  to  increase  the  throughput.   Gader  has  shown  template  decomposition 
and  localization  methods  for  mesh  architectures  that  result  in  faster  computation 
[30,31].  Also,  Manseur  [50]  has  proven  that  for  certain  variant  template 
definitions  (e.g.;  t,  column  rank  one)  there  exists  an  invariant  vertical  template  tt 
and  a  (possibly  variant)  horizontal  template  t2  such  that  t  =  tt  ©  t2.   Thus,  the 
overall  problem  can  be  reduced  to  a  linear  combination  of  templates  where  at  least 
a  portion  of  the  computation  can  be  executed  at  a  higher  rate. 

Although  the  methods  just  mentioned  result  in  an  increase  in  throughput, 
the  conclusion  for  their  use  remains  with  the  specific  implementation.  Assuming 
the  implementation  is  still  the  correlator  interface  for  ATR;  if  the  frame  rate 
(update  rate)  of  the  system  is  high  (above  30  Hz),  then  the  use  of  variant 
templates  in  any  acquisition  algorithms  should  be  avoided. 


CHAPTER  4 
VANDER  LUGT  SYSTEM  IMPLEMENTATION 


The  hybrid  electro-optical  system  described  here  is  also  based  on  the  notion 
of  integrating  Fourier  optical  correlation  and  digital  image  processing.  The 
imaging  media,  holographic  film,  makes  this  implementation  unsuitable  as  a 
solution  to  the  ATR  problem  defined  in  Chapter  2.  It  does,  however,  have  direct 
application  for  many  image  processing  tasks  that  use  this  high  resolution  imaging 
media;  e.g.,  photo  reconnaissance,  industrial  robotics,  blood  analysis,  etc. 
Moreover,  the  reason  for  including  it  in  this  dissertation  is  that  the  image  algebra 
interface  with  a  Vander  Lugt  optical  system  is  noteworthy  in  general.  The 
interface  affords  additional  methods  for  feature  extraction  which  complement  the 
correlation  function. 

4.1  System  Description 

The  difference  between  this  configuration  and  the  one  described  in  Chapter 
3  is  the  imaging  media,  spatial  light  modulators  are  replaced  by  holographic  film 
for  both  image  and  matched  filter  inputs.  Also,  the  holographic  film  is  capable  of 
storing  both  the  magnitude  and  phase  of  the  Fourier  transform.  This  contrasts  the 
phase-only  SLM  implementation.  The  method  of  producing  a  matched  filter  on 
holographic  film  is  based  on  the  work  of  Vander  Lugt  [72],  hence  the  correlator 
implementation  is  commonly  referred  to  as  a  Vander  Lugt  system. 
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The  advantages  of  using  this  method  are:  the  capability  to  statistically  and 
repeatedly  examine  large  numbers  of  images  for  the  presence  of  specific 
information  content,  the  resolution  of  the  film  is  higher  than  that  of  the  SLM,  and 
the  overall  system  cost  is  lower.  The  disadvantages  are  that  the  template 
configuration  and  values  must  be  known  a  priori,  thereby  eliminating  program 
controlled  dynamic  reconfiguration;  the  procedure  for  exchanging  filters  is 
mechanical;  and,  the  space  bandwidth  product  tends  to  be  large.   The  latter  results 
in  an  increase  in  the  physical  size  of  lenses,  film  plates,  etc. 

The  image  algebra  algorithm  suite  for  the  kinds  of  processing  tasks 
mentioned  can  be  executed  using  a  limited  number  of  predefined  templates,  or 
filters.   Predefined  templates  will  allow  for  a  system  of  selectable  multiple  space 
and  frequency  multiplexed  fixed  filter  sets  similar  to  the  one  proposed  by  Casasent 
and  Schaefer  [14,15].   The  technique  involves  a  finite  number  of  templates  stored 
on  film  and  multiplexed  using  a  selectable  laser  diodes  array  (see  Figure  21).  By 
selecting  the  proper  diode,  the  focal  point  on  the  holographic  film  plane  is 
translated  to  the  desired  template  and  convolved  in  the  manner  described  in 
Chapter  3.  The  necessary  composite  matched  filters  or  synthetic  discriminant 
functions  are  included  on  the  film  containing  the  templates  [13,44].  The  remaining 
digital  circuitry  is  the  same  as  that  defined  in  Section  3.2  with  the  only  exception 
being  the  absence  of  any  feedback  to  the  input  image.  This  proposed  system 
configuration  allows  for  some  degree  of  template  programmability  and  eliminates 
any  mechanical  exchange  action. 

Typically,  the  number  of  non  zero  template  elements  for  template 
configurations  used  in  image  algebra  operations  tends  to  be  small  compared  to  the 
number  of  non  zero  elements  used  by  matched  filter  correlators.   Therefore,  the 
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space  bandwidth  product  that  is  necessary  for  cross  correlation  is  sufficient  for  the 
convolution  type  operations  of  the  image  algebra. 

4.1.1  Filter  Generation 

The  2-dimensional  Fourier  transform  of  a  continuous  function  f(x,y)  is 
defined  as 

00 

9{f(x,y)}  =  F(u,v)  =[[  f(x,y)  e"2i7r(  xu  +  *v)  dx  dy, 

-00 

and  in  a  similar  manner  the  function  h(x,y).   The  convolution  and  correlation  of 
f(x,y)  and  h(x,y)  are  further  defined  as 

f(x,y)  ®h(x,y)  =  F(u,v)  •  H(u,v)  and 
f(x,y)Oh(x,y)  =  F(u,v)  •  H*(u,v), 

where  ©  denotes  the  convolution,  O  denotes  correlation,  and  *  represents  the 
complex  conjugate  [36]. 

If  the  reference  image,  h(x,y),  is  placed  on  a  transparent  film  and 
illuminated  by  a  plane  wave  of  coherent  light,  then  its  Fourier  transform  is  formed 
at  the  focal  plane  using  a  simple  lens.  However,  recording  film  located  at  the  focal 
plane  can  only  record  the  power  spectrum  of  the  Fourier  transform  [33].  The 
Fourier  transform,  as  opposed  to  the  phase-only  operation  previously  described,  is 
needed  in  this  system  to  remove  or  attenuate  specific  frequency  components  in  the 
image.  Therefore,  a  complex  filter  H(u,v)  is  needed  to  perform  the  necessary 
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filtering  operations.  Two  pieces  of  information  are  required  to  produce  such  a 
filter,  namely  the  real  and  the  imaginary  parts  of  the  waveform. 

Vander  Lugt  proposed  a  heterodyning  technique  that  incorporates 
holographic  film  to  generate  and  store  the  necessary  complex  waveform  [72].  A 
simple  lens  is  used  to  transform  the  reference  image  that  corresponds  to  the 
template.  The  Fourier  transform  of  the  reference  image,  §{h(x,y)},  is  brought  to 
focus  on  a  photographic  film  plane.  A  plane  wave  is  directed  at  the  film  plane  at 
an  angle  0  and  mixed  with  the  transform  at  the  film  plane  as  shown  in  Figure  22 
[12].  The  resultant  pattern  caused  by  the  constructive  interference  of  the  two 
fields  is 


2ifl"av 


Ha(u,v)  =  H(u,v)  +  A  e 
where  a  =  ^ 


is  referred  to  as  the  spatial  carrier  frequency,   A  the  wavelength  of  the  light,  and 
A  and  e2l7rav  are  the  amplitude  and  phase  of  the  reference  wave,  respectively. 
For  most  applications,  A  is  of  little  interest  and  is  set  to  unity.  The  film  records 
not  the  field  itself  but  rather  the  square  magnitude  of  the  field,  thus 

Ha(u,v)  =  l+  |H(u,v)|2 
+  H(u,v)  e2i7rav  +  H*(u,v)  e-2i7rav, 

contains  a  constant,  the  power  spectral  density,  and  two  terms  due  to  the  presence 
of  the  spatial  carrier  frequency.  The  two  spatially  modulated  terms  contain  the 
Fourier  transform  information  [72]. 
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Both  the  magnitude  and  phase  of  the  Fourier  transform  are  co-located  on 
the  holographic  film;  typically  a  dichromated  gelatin  (DCG)  which  is  a  high 
quality  material  for  recording  matched  filters  [10,64].    Computer  generated 
holograms  written  to  bleached  silver  halide  plates  is  another  recording  method 

[12]. 

4.1.2  Filter  Implementation 

The  film  can  now  be  placed  at  the  Fourier  transform  lens  focal  plane  and 
illuminated  with  the  Fourier  transform  of  the  test  image,  §{f(x,y)}.  This 
corresponds  to  the  filter  bank  location  as  shown  in  Figure  21.  The  product  of  the 
film  transmittance  and  the  illuminating  Fourier  transform  is  the  function  G(u,v), 
where 


G(u,v)  =  F(u,v)  •  H  (u,v) 
a 


=  F(u,v)  +  F(u,v)  •  |H(u,v)|2 

+  F(u,v)  •  H(u,v)  •  e2i7Fav 

+  F(u,v)  •  H*(u,v)  •  e-2i7Fav. 

Another  lens,  L2  in  Figure  21,  is  used  as  an  inverse  Fourier  transform  of  this 
product  to  simultaneously  obtain  the  convolution  and  correlation  of  the  two 
images  [72],  denoted  by 


88 
g(x,y)  =  f(x,y)  +  [f(x,y)  ®h(x,y)]  ®h*(x,y) 

+  f(x,y)®h(x,y)  £(x,y+a) 

+  f(x,y)  ©h*(x,y)  tf(x,y-a), 

where  £(x,y±a)  is  simply  a  translation  function  and  h  (x,y)  =  h(— x ,— y). 

4.2  Image  Algebra  Process  Distribution 

The  distribution  of  image  algebra  operations  for  this  implementation  differs 
significantly  from  the  implementation  defined  in  Chapter  3.  The  shifting 
operation  using  the  Fourier  optics  and  a  sine  function  at  the  filter  location  is  still 
the  same  with  the  exception  that  the  sine  function  has  non-unit  amplitude.  The 
presence  of  the  amplitude  in  the  sine  function  eliminates  the  need  for  the  weighting 
processor  section  of  the  digital  side  of  the  hybrid  configuration.  This  allows  the 
complete  template  configuration  to  be  included  in  the  filter  definition.   Therefore, 
general  convolution  operations  are  completed  in  one  iteration.  Since  the 
magnitude  of  the  filter,  which  corresponds  to  the  value  of  the  template,  is 
inherently  multiplied  by  the  optical  convolution  properties,  the  morphological 
convolution  operations  are  restricted  to  the  multiplicative  maximum  and 
minimum  as  defined  in  Chapter  2.  The  accumulation  processor  section  remains 
the  same  with  the  exception  that  the  template  register  is  input  directly,  refer  to 
Figure  13.   All  pointwise  operations  are  accomplished  in  the  same  manner  as 
described  in  Tables  2  and  3. 
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4.2.1  Morphological  Operation 

The  objective  in  this  section  is  to  define  an  expression  which  allows  the 
multiplicative  nonlinear  convolution  operations  to  be  expressed  in  terms  of  linear 
convolution  operations.  The  expression  is  first  established  for  the  multiplicative 
maximum,  denoted  as  ®.   Once  the  expression  is  defined  for  0,  the  nonlinear 
operation  for  multiplicative  minimum  ®  will  automatically  follow. 

In  support  for  the  theorem  that  follows,  suppose  Fe{  R-^,  1  kU  {m}  }  and 
t  G  (F  )     denotes  a  translation  invariant  template  with  card{  es^t  ))  =  n. 

For  some  arbitrary  point  y  6  X,  let   {  x  ,  x  ,  •  •  • ,  x   }  =  e^(t  )  and  for 
i  =  l,  2,---,n,  set  c^tyfe). 

Define  a  parameterized  template  s(i)  from  X  to  X  with  defined 
parameter  i  e  {1,  2,>  •  -,n}  defined  by 


Ci,  if  x  =  x, 

s(i)yw  =  ; 

0,  otherwise. 


Note  that    e/(s(i)  )  =  {  x{  }  for  i  =  1,  2,  — ,  n. 


Theorem  2:  aOt  =  \A  a8s(i)  (28) 

i=l 

PROOF:  For  i  =  1,  2,  •  •  • ,  n,  let  af  =  a  ©  s(i).   Thus,  for  an  arbitrary  point  y  G  X, 
we  have 


a,(y)  =  X  a(x)  •  s(i)y(x) 


xei 
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2,        a(x)  •  s(i)y(x) 

X6<^(s(i)y) 


since    e/'(s(i)y)  =  {xj.  This  shows  that 

ai(7)  =  a(Xi)  •  Ci  (29) 

Now  if  b  =  a  ©  t,  then  for  an  arbitrary  point  y  G  X  we  have,  by  the  definition  of 
the  convolution  operator  ©,  that 


b(y)  =  Y  a(x)  •  ty(x) 


xei 


=         y    a(x)-ty(x) 
««^(ty) 


=  Y  "W "  VXi) 

i=l 


=  Y  *w  Ci 


i=l 
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=  Y  ai(y)        (by  Equation  29). 


i=  1 


Therefore,  a  ©  t  =  W  at  =  W  a  ©  s(i) 

i  =  l  i=  1 

Q.E.D. 


As  a  corollary  to  Theorem  2,  the  multiplicative  minimum  ®  is  simply 


a©t  =  A  ■  a©s(i).  (30) 


i=l 


Proof:  The  proof  is  identical  to  the  proof  of  Theorem  2  with  the  appropriate 
global  reduce  function  and  corresponding  induced  operation 

Q.E.D. 

4.2.2  Negative  Real  Value  Manipulation. 

Some  image  processing  algorithms  require  template  definitions  to  contain 
negative  values.  Edge  masks  using  finite  differences  are  examples  of  such 
templates.  The  Vander  Lugt  optical  system  described  above  uses  only  positive 
real  values.   The  following  method  is  presented  as  a  solution  to  this  condition. 

Using  the  relationship  previously  developed  for  ffi,  one  can  think  of  the 
notion  of  an  additional  sub  junction  of  positive  and  negative  values  which  when 
properly  combined  fulfill  the  stated  requirement.  Specifically,  the  generalized 
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convolution  ©  for  image  a  and  template  t,  using  the  parameterized  template 
s(i)  defined  in  Theorem  2,  can  be  defined  as 

a  e t  e  a  e  s(l)  +  a  e  s(2)  +  •  •  •  +  a  ©  s(n). 

Suppose  that  a  portion  of  the  values  in  these  single  valued  templates  are 
negative.  A  distribution  property,  proven  by  Gader  [30],  states  that  for  image  a  e 
F  and  templates  t  and  t   6  F , 


(affit )  +  (aet  )  =  ae(  t  +  t   ). 
r  r  i       2 


(31) 


Now,  each  real-valued  template  t  can  be  decomposed  into  two  templates 


t    and  t  ,  where 


tYM   =   ■ 


ryx)  if  ty(x)>o 

0         otherwise 


ty(x)  = 


ty(x)   if    ty(x)  <  0 
0         otherwise 


and  t  =  t   +  t .  According  to  Equation  31,  we  now  have  that 


aet  =  (aet  )  +  (aet ) 


(32) 


Since   1 t  (x)  |  =  —  t  (x)  for  all  x  and  y  in  the  domain  of  the  definition  of  t  , 


we  also  have  that 
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a(x)  •  |ty(x)|  +  a(x)  •  ty(x)  =  a(x)  [|ty(x)|  +  ty(x)] 


=  a(x)  [-ty(x)  +  ty(x)] 


=  0, 


or,  equivalently, 


a(x).  |t;(x)|=-a(x).t"(x). 


This  shows  that 


a$  |t    |  =-a8t  (33) 

Equations  32  and  33  imply  that 

aet  =  (aet+)-(ae|t~  |).  (34) 

Thus,  we  can  now  compute  each  convolution  a  ©  s(i)  in  terms  of  two  positive 

+ 
convolutions  a  ©  s  (i)  —  a  ©  |  s  (i)  | .  This  accommodates  the  explicit  requirement 

of  the  optical  system. 

In  this  final  form  the  analog  component  of  the  architecture  computes  the 

two  non-negative  convolutions  while  the  subtraction  is  carried  out,  point  wise,  by 

the  digital  component. 
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4.2.3  Binary  Operation 

The  operations  that  are  described  in  Section  3.5.2  are  directly  applicable  in 
the  Vander  Lugt  optical  system  configuration.  Therefore,  the  binary  image 
algebra  defined  by  Huang  et  al.  [41]  is  realized  completely. 

4.3  Conclusions 

The  concept  of  image  algebra  operations  on  a  Vander  Lugt  hybrid 
electro-optical  system  is  viable.  The  computational  time  complexity  for  the  linear 
convolutions  and  binary  operations  is   0(1)  while  the  gray  level  morphological 
operations  have  complexity   0(N),  where  N  is  the  number  of  elements  in  the 
template  support.  The  additional  flexibility  that  the  image  algebra  brings  to  this 
method  of  optical  computing  covers  a  wide  area  of  applications  and  provides  for  a 
far  more  powerful  optical  computing  system  than  that  proposed  by  Huang  [41]. 


CHAPTER  5 
ORDER  STATISTIC  FILTERING 


Order  statistics  (OS)  or  ranked  order  filters  constitute  a  general  class  of 
nonlinear  filters  used  in  digital  signal  processing  (DSP)  as  well  as  image 
processing.  The  median  filter  is  the  most  widely  used  filter  of  this  general  class. 
This  is  probably  due  to  the  simplicity  of  its  formulation  and  straightforward 
implementation.  The  use  of  the  remaining  OS  filters  continue  to  develop  as 
research  topics  [11,21,48,73]. 

The  reason  for  including  OS  filters  in  this  dissertation  is  twofold.  First,  the 
nonlinear  filtering  properties  are  ideally  suited  for  preprocessing  images  for  the 
ATR  correlator  operation  described  in  Chapter  2.  The  median  filter,  for  instance, 
can  suppress  and  quite  often  eliminate  impulse  noise  while  preserving  the  edges  of 
objects  in  an  image.  This  type  of  noise,  very  large  positive  or  negative  values  of 
short  duration,  is  predominant  in  active  systems  such  as  laser  and  synthetic 
aperture  radar.  Since  the  correlator's  composite  matched  filters  are  usually 
produced  from  edge  enhanced  images,  the  edges  must  be  preserved  to  provide  the 
highest  probability  of  a  match.   Linear  filters,  while  convenient  to  implement  in  a 
correlator  and  capable  of  removing  some  impulse  noise,  have  a  tendency  to  smooth 
or  even  smear  the  edge  information. 

Secondly,  a  new  method  is  presented  which  provides  the  OS  in  the  general 
case.  It  is  a  natural  extension  of  the  architectural  design  presented  in  Chapter  3. 
The  method  is  capable  of  generating  any  OS  at  essentially  the  same  cost  function 


95 


96 

as  a  single  convolution  operation.  Its  performance  with  respect  to  previous 
methods  is  discussed. 

5.1  Background 

An  OS  filter  is  a  simple  nonlinear  filtering  operation  in  which  a  template 
support  moves  over  an  image  in  a  manner  similar  to  a  convolution.  However, 
instead  of  a  sum  of  products  at  each  pixel,  the  ith  largest  pixel  value  within  the 
template  support  is  taken  as  the  output.  The  OS  filter  is  nonlinear  due  to  the 
ordering  process.  In  most  applications,  sorting  the  values  within  the  template  to 
achieve  the  ordering  introduces  an  undesirable  time  constraint.  Fast  algorithmic 
methods  have  been  demonstrated  for  the  median  filter  [6,42],  however,  these 
methods  are  sequential  and  not  usable  in  general  for  the  ordered  statistic  nor  are 
they  suitable  for  parallel  architectures. 

The  median  filter,  a  singular  case  of  the  OS,  was  first  used  as  a  smoothing 
technique  in  time  series  analysis  by  Tukey  [71].  As  stated  earlier,  the  median 
filter  application  has  been  investigated  extensively.  One  notable  investigation  was 
by  Gallagher  et  al.  concerning  the  root  signal  properties  of  medial  filters 
[3,27,32,73]. 

The  root  signal  is  defined  as  a  signal  which  is  invariant  to  the  median  filter. 
This  is  derived  by  a  succession  of  median  filters  of,  initially,  a  non  root  signal  up 
to  a  maximum  of  (L  —  2)/2,  where  L  is  the  length  of  the  signal  [3].   This  area  of 
research  lends  itself  to  filtering  techniques  such  as  oscillation  suppression,  gradient 
reduction,  and  selective  impulse  removal.  An  implementation  of  the  root  signal 
state  description  is  the  stack  filter  defined  by  Wendt,  Coyle,  and  Gallagher  [73]. 
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The  particular  method  of  signal  decomposition  used  for  the  stack  filter  was 
the  foundation  for  the  morphological  investigation  by  Maragos  and  Schafer  [51,52]. 
They  have  shown  that  not  only  the  stack  filter  concept  but  the  entire  OS  filtering 
concept  can  be  achieved  using  sequential  applications  of  morphological  operations. 
The  importance  of  this  mathematical  assessment  was  the  application  of  the  theory 
to  the  analog  domain  as  well  as  DSP. 

5.1.1  DSP  -  Stack  Filter  Approach 

The  concept  of  stack  filters  was  developed  as  a  signal  enhancement  method 
for  digital  signal  processing.  It  used  threshold  decomposition  and  a  stacking 
property  as  the  underlying  properties  for  the  technique. 

Threshold  decomposition  is  defined  as  a  decomposition  of  an  Af-valued 
input  signal  into  a  set  of  M—  1  binary  signals.   For  each  threshold  value  k  €  {1,  2, 
•  •  • ,  M—  1}  the  corresponding  binary  signal  will  record  a  1  where  the  signal  is 
greater  than  or  equal  to  k,  and  a  0  otherwise.  The  input  signal  can  be  fully 
recovered  by  adding  the  respective  binary  signals. 

The  binary  signals  can  be  OS  filtered  then  summed,  and  the  result  will  be 
the  same  as  the  output  signal  of  an  OS  filter  on  the  original  signal  as  shown  in 
Figure  23.  The  gain  in  doing  the  decomposition  is  that  OS  filters  for  binary 
signals  are  easily  constructed  using  binary  circuits  representing  Boolean 
expressions  of  sum-of-products  or  product -of-sums.  Furthermore,  the  binary 
circuits  can  be  arranged  in  parallel  and  easily  implemented  using  very  large  scale 
integration  (VLSI).  Instead  of  summing  all  the  binary  signals,  the  stacking 
property  is  used.  The  property  states  that  since  the  binary  signal  are  stacked 
according  to  each  threshold  level  such  that  each  column  will  consist  of  all  Os,  or 
all  Is,  or  a  group  of  Is  topped  with  a  group  of  Os,  the  desired  output  is  the  level 
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Input  Signal 


Output  Signal 


1  _ 


2_ 


110  2  3  3  12  2 


Threshold 
Decomposition 


Level 
3     000011000 

2     0  0  0  1110  11 

1      110  111111 


Median 
Filter 


Sr 


Binary  Median 
Filter  B 


Binary  Median 
Filter  B 


Binary  Median 
Filter  B 


1112  3  3  2  2  2 


Stacking 
Property  (Sum) 


0  0  0  0  110  0  0 


0  0  0  111111 


111111111 


B  (X1 ,  X2  ,  Xg  )  =  X.X_+  X..X.J+  Xp*3 


Figure  23.   Stack  filter  —  3  window  median  filtering. 
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just  before  the  transition  from  1  to  0.  This  is  conveniently  implemented  in  VLSI 
by  performing  a  binary  search  on  each  filter  output  in  bit-reversed  order. 

Upon  further  inspection,  this  "stack"  of  filters  can  be  any  binary  circuit 
representing  the  number  of  elements  in  the  filter  window.  With  this,  Wendt  et  al. 
[73]  concluded  that  these  filters  include,  as  a  subset,  all  the  ranked-order  operators 
and  also  represent  all  possible  compositions  of  morphological  filters  as  described  by 
Maragos  and  Schafer  [52]. 

A  general  threshold  function,  T    ,  ,  defined  in  terms  of  a  threshold  k  and 
weighting  vector  a  =  (av  a2,-  ■  •,  an)  for  input  vectors  of  length  n  is  given  by 


Ta,k(Ii'a:2."-'In)  = 


1   if    E   a,x,  >  k 

i=  1     x  1  ~ 

0  otherwise 


or,  simply,  by 


T(x)  =  1  *=>  a  •  x  >  k, 

where  x=  (xv  x2,---,xn). 

Now  given  a  threshold  function  T  with  integer  weights   av  a2,  •  •  • ,  aQ  and 
integer  threshold  k,  then  by  repeating  the  ith  input,  x{,  a{  times,  each  with  unit 
weights,  one  obtains  the  same  output  as  the  sum  of  the  Boolean  functions. 
Therefore,  the  stack  filter  5m  as  a  generalized  rank-order  filter  by 

Srp  (s1(  s2,-  •  • ,  sn)  =  Ath  largest  element  of  the  set 

{»!»••  ••  »n  s2,---,s2,---,sn,---,  sj,  (35) 
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where  Sj,  the  ith.  sample  in  the  window  of  size  n,  is  repeated  &i  times.  As  an 
example,  let  &i=  1  for  all  i,  then  5™  defines  an  (n— k)th  OS  filter.  If  n  is  odd 
and  k  =  [n/2j ,  then  S™  defines  a  median  filter  [73].  Tables  7-10  list  the  stack 
filters  and  their  corresponding  Boolean  functions  for  a  window  width  3.  The  filters 
are  grouped  according  to  the  roots  of  the  filters  they  produce.  Figure  24  shows  an 
example  of  what  Wendt  et  al.  refer  to  as  an  asymmetric  median  filter.  The  filter  is 
defined  in  Table  10  with  the  comment  that  it  eliminates  negative  going  impulse 
noise;  i.e.,  positive  going  impulse  noise  is  not  filtered  out,  as  illustrated  in  Figure 
24. 

The  stack  filter  concept,  with  its  Boolean  representation  and  a  low  number 
of  window  elements  (reportedly,  3  [73]),  is  well  suited  for  time  variant  DSP. 
However,  an  implementation  of  the  concept  with  the  architecture  described  in 
Chapter  3  is  impractical.  The  number  of  binary  filters  (Boolean  circuits)  must 
match  the  number  of  gray  levels,  256  in  most  applications.   Considering  the 
parallel  aspects  of  the  design,  Section  3.2,  this  large  number  of  binary  filters  would 
be  required  at  each  pixel. 

5.1.2  Morphological  Approach 

All  OS  filtering  operations  are  easily  defined  in  the  set -theoretic, 
morphological  algebra  (as  proven  by  Maragos  and  Schafer  [52]).  Included  in  the 
definition  is  a  straightforward  conversion  of  all  stack  filter  Boolean  equations  to 
maximum  and  minimum  set  operations. 

The  focus  in  the  remaining  discussion  is  on  the  following  question:  "Since 
the  algebra  of  mathematical  morphology  can  be  considered  as  a  subalgebra  of  the 
image  algebra  [22],  are  the  morphological  operations  for  OS  filtering  applicable  to 
the  architecture  described  in  Chapter  3?" 
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Table  7.  Stack  filters  which  preserve  only  constant  roots. 


5(xl,  x2>  x3) 


^j(Sl>  S2'  S3/ 


Comments 


0 

1 


X1X2 


xl+x3 
X1X2X3 


0 

M-l 

st 

S3 

min^,  s3) 

max(s1,  s3) 

min(s1,  s2,  s3) 

max(s1}  s2)  s3) 


Filters  every  input  to 

0. 

Filters  every  input  to 

the  Mth  value  minus 

1. 

Right -shift  operator. 

Left -shift  operator. 


minimum  filter 
maximum  filter 


Table  8.   Stack  filters  which  preserve  only  increasing  roots. 


£(xi,  x2>  xs) 


*  J\S1>  S2»  S3/ 


X2X3 

x1+x2 

X1+X2X3 
xlx3"t"x2X3 


min(s2J  s3) 

max(s!,  s2) 

2nd  largest  element  of 

lSl)  Sl>  S2»  S3J 

2nd  smallest  element  of 

lsl>  s2)  S3'  S3J 
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Table  9.  Stack  filters  which  preserve  only  decreasing  roots. 


B(xv  x2,  x3) 


Oj^Sj,  s2,  s3) 


X1X2 
X2+X3 

Xj+XjXj 
X1X2"'"X1X3 


mm(Si,  s2) 

max(s2,  s3) 

2nd  largest  element  of 

isl>  s2>  S3>  S3J 

2nd  smallest  element  of 

lsl>  sl>  s2>  S3J 


Table  10.  Stack  filters  which  preserve  median  filter  roots 
and/or  other  roots. 


#(xi,  x2>  xa) 

" y(si>  s2>  s3) 

Comments 

X2 

«2 

The  identity  function: 

X1X2"'"X1X3"'"X2X3 

median(s1,  s2,  s3) 

preserves  everything. 
The  median  filter. 

x2"'"xlx3 

2nd  largest  of 

Eliminates  negative 

\sl>  sii  s2>  S3J 

going  impulses. 

X1X2"'"X2X3 

2nd  smallest  of 

Eliminates  positive 

lsl>  s2>  S2»  S3J 

going  impulses. 
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B  (x1 ,  x?  ,  Xj  )  =  x.,x3+  *2 


Figure  24.   Stack  filter  —  3  window  asymmetric  median  filtering. 
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In  Reference  [52]  Maragos  and  Schafer  make  the  following  observation.  For 
any  function  /  and  any  finite  set  W,  the  fcth  OS  of  /  by  W,  k  =  1,2,.. .,n  = 
card(W),  is  equal  to  the  pointwise  maximum  of  the  moving  local  minima  of  / 
inside  all    j?|   windows  equal  to  the  subsets  of  W  containing  exactly  k  points, 
and  it  is  also  equal  to  the  minimum  of  the  moving  local  maxima  of  /  inside  all 
subsets  of  W  containing  exactly  n— k+1  points.  The  number  of  combinations  of 
n  items  taken  k  at  a  time  is  defined  as     ?    =  n!  /  [k!(n— k)!]  and  0!  =  1.  This 
observation  can  be  further  refined  as  the  Arth  OS  of  sets  by  a  window  W  being 
equal  to  the  maximum  of  erosions  by  all  the  subsets  of  W  containing  k  points. 
Likewise,  the  minimum  of  dilations  by  all  the  subsets  of  W  containing  n— k+1 
points.   A  clear  example  of  this  can  be  made  by  defining  the  median  filter  Sj,  for 
each  value  x  in  the  input  signal  /  of  Figure  23  as 

Sjif)  =  max{  min{/(x),/(x+l)},  min{/(x-l),/(x)}, 
min{/(x-l),/(x+l)}   }. 

What  is  readily  noticeable  and  defined  by  Maragos  and  Schafer  is  that  the 
set  theoretic  operations  are  obtained  by  replacing  the  and/  or  operations  of  the 
Boolean  equations  in  the  stack  filters  with,  respectively,  the  max/min  operations 
of  the  morphological  algebra  [52]. 

These  definitions  translate  directly  into  image  algebra  and  are  relative  to 
the  n-dimensional  image  space.  From  above,  using  image  algebra  notation  and 
typical  symbolism,  consider  an  image  a  e  F     and  a  window  W  of  cardinality  n. 
Let  W    denote  the  window  W  shifted  to  location  y  e  X  in  order  to  compute 
the  kth  order  statistic  of  a  at  pixel  location  y.  Note  that  W  defines  a 
template  t  €  (F^  )  ,  namely 
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tyM  = 


0    if    x  6  Wy 
+od  otherwi  s  e 


with    <,(ty)  =  Wy. 

For  i  e  {1,  2,-  •  •,  I?  },  let  W(i)  denote  a  subwindow  of  W  consisting  of 
exactly  k  points  and  such  that  W(i)  ^  W(j)  wherever  i  i  j.  Define  t(i)  6 


(Foo1)1  by 


t(i)y(x)  = 


0    if    x  e  W(i)y 
+od  otherwi  s  e 


Observe  that    <(t(i)y)  =  W(i)y  and,  therefore,   card{  <(t(i)y))  =  k. 
The  Ath  OS  of  a  can  now  be  defined  as 


L 

b  =  V  a  E3  t(i), 

i  =  1 

where  L  =    ,    .   An  example  of  a  median  filter  of  a  two-dimensional  image  is 
illustrated  in  Figure  25. 

The  image  algebra  operations  just  presented  are  executable  on  the 
architecture  described  in  Chapter  3;  however,  the  execution  time  complexity  as  a 
function  of  increasing  template  size  is   (9(N!)  and  unacceptable.  For  example,  to 
execute  a  median  filter  in  this  manner  using  a  common  3*3  template,  the  number 
of  additive  minimum  operations  is  196.   For  a  5*5  template  the  number  grows  to 
5,200,300.  Even  considering  the  application  for  a  multiple  instruction  multiple 
data  (MIMD)  computer  such  as  the  Connection  Machine  [37],  the  number  is 
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Figure  25.  Example  of  morphological  median  filter. 
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overwhelming.  The  observation  is  similar  to  that  for  the  stack  filter;  for  a  small 
number  of  window  elements,  it  is  well  suited  as  a  time  variant  DSP  filter. 

5.2  Stack  Comparator 

The  conclusion  from  the  analysis  of  the  preceding  section  is  clear.  In  order 
to  adequately  execute  OS  filtering  operations  on  the  proposed  architecture  of 
Chapter  3,  it  is  necessary  to  design  specific  hardware  that  can  provide  the 
intermediate  sorting  operation. 

There  are  many  methods  in  use  for  sorting  a  string  of  numerical  values 
[2,21,45,47].  The  methods  of  interest  are  the  ones  that  can  interface  directly  with 
the  accumulation  processor  defined  in  Section  3.2.   Sequential  methods  such  as 
quick  sort,  the  fastest  -  time  complexity   0(NlogN),  where  N  is  the  length  of  the 
string  of  values,  are  impractical  [45].  The  time  complexity  is  too  great  and  any 
hardware  implementation  is  unusable  with  the  design  proposed  in  Chapter  3. 
Recall  that  the  filtering  operations  are  performed  in  parallel  at  the  pixel  level. 

Other  methods  which  are  parallel  in  their  operation  (e.g.,  Batcher's 
odd -even  merge  sort  [2])  have  less  hardware  complexity  than  the  sequential 
methods,  require  far  fewer  control  instructions,  and  execute  with  a  time 
complexity  of  0(logN).  As  good  as  these  parallel  methods  may  be,  there  are  still 
fundamental  design  interface  issues  that  render  them  inappropriate.   First  of  all, 
each  of  the  methods  reviewed  required  the  numerical  string  to  be  gathered  in 
parallel  to  achieve  the   0(logN)  time  complexity.   Secondly,  the   0(logN) 
performance  requires  the  length  of  the  numerical  string  to  be  a  power  of  2. 

The  accumulation  processor  has  a  greatest  lower  bound  of  0(N)  with 
respect  to  how  it  accumulates  the  numerical  values.  The  values  are  received  one 
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at  a  time  due  to  the  shifting  operation  of  the  spatial  configuration  processor. 
Waiting  until  all  numerical  values  are  gathered  before  beginning  the  sorting 
operation,  as  well  as  restricting  the  length  of  the  string  to  a  power  of  2,  is 
impractical  for  the  proposed  system. 

A  new  method  of  sorting  is  introduced  in  the  next  section.  It  is  optimal 
with  respect  to  the  architecture  of  Chapter  3  in  that  the  cost  function  for 
computing  an  OS  filter  operation  is  essentially  the  same  as  any  other  generalized 
matrix  operation  (e.g.,  convolution). 

5.2.1  Concept  Description 

The  concept  of  a  stack  comparator  is  developed  around  the  notion  of  sorting 
the  numerical  values  in  parallel  as  they  are  received.   A  single  numerical  value  can 
be  simultaneously  compared  to  N  other  values  using  N  interconnected 
comparator  circuits.   Such  a  process  is  termed  flash  comparison.  Figure  26  shows 
the  gate  level  component  layout  and  truth  table  for  a  single  comparator  circuit 
[66]. 

By  arranging  the  interconnected  comparator  circuits  in  a  stack  structure, 
the  top  of  the  stack  can  be  consistently  used  as  the  single  numerical  value  that  is 
simultaneously  compared  to  all  previous  values  as  shown  in  Figure  27.   As  an 
incoming  value  is  compared  to  preceding  values,  the  number  of  times  the  top  value 
is  found  greater  than  or  equal  determines  its  rank-order. 

Individual  counters,  initialized  to  zero,  are  connected  to  the  greater  than 
and  equal  to  output  of  each  comparator  and  separately  for  the  top  of  the  stack 
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Figure  26.  A  comparator  circuit. 
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(refer  to  Figure  27).  As  a  numerical  value  is  pushed  onto  the  stack,  the  following 
two  control  conditions  are  automatically  invoked: 

Step  1:  The  counter  associated  with  the  top  of  the  stack  is  incremented. 

Step  2:  If  the  value  at  the  top  of  the  stack  is  less  than  the  jth  value  in  the 
stack,  then  increment  the  counter  associated  with  the  top  of  the  stack  (the  less 
than  output  of  the  jth  comparator  goes  high).   Otherwise,  the  counter  connected 
to  the  jth  value  is  incremented  (the  greater  than  or  the  equal  to  outputs  of  the 
jth  comparator  go  high). 

Allowing  for  a  signal  propagation  time  for  each  comparator  in  parallel  when 
a  value  is  pushed  onto  the  stack,  the  comparators'  counter  values  are  the  sorted  or 
rank-ordered,  thus   0(N).  Now  simply  select  the  numerical  value  whose 
respective  counter  equals  the  desired  OS  (ith  ranked -order).  Figure  28  illustrates 
the  data  flow  of  a  conceptual  stack  comparator. 

5.2.2  Proof  of  Concept 

The  proof  of  concept  is  conducted  using  the  rules  of  inference.   The  basic 
form  of  the  proof  is  similar  to  the  proof  for  an  if-then-else  condition  with  the 
condition  being  allowed  to  execute  in  parallel. 

Let: 

the  initial  assertion  p  be:  the  stack  comparator  counters  correctly 
rank-order  their  respective  numerical  values; 

the  condition  be:  true  if  the  top  of  the  stack  value  is  less  than  the  ith 
value,  and  false  otherwise; 

the  execution  SI  be:  increment  the  top  counter; 

the  execution  S2  be:  increment  the  ith  counter;  and, 
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the  final  assertion  q  be:  the  stack  comparator  counters  still  correctly 
rank— order  their  respective  numerical  values. 
The  proof  is  complete  if  it  can  be  shown  that 

(1)  when  p  is  true  and  condition  is  true,  then  q  is  true  after  Si 
terminates. 

(2)  when  p  is  true  and  condition  is  false,  then  q  is  true  after  S2 
terminates. 

Suppose  that  the  counters  correctly  rank -order  n  respective  values  on  an 
N  length  stack  comparator,  where  n  <  N. 

[p  is  initially  true]. 

Let  the  (n+l)th  value  to  be  pushed  onto  the  stack.  If  the  top  value  is  less  than 
m  values  and  greater  than  n— m  values,  where  0  <  m  <  n,  then  the  top  counter 
will  be  incremented  m  times  from  its  initial  count  of  1. 

[  (p  and  condition)  {Si}  in  parallel,  m  times  ]. 

The  top  counter  now  reads  m+1.  The  counters  corresponding  to  the  n— m  values 
are  individually  incremented  by  1. 

[  (p  and  not  condition)  {S2}  in  parallel,  n— m  times  ]. 


114 

The  state  of  the  stack  comparator  when  SI  and  S2  terminate  (for  all 
values  of  m)  is: 

m+1  the  counter  value  for  the  top  of  the  stack 

1  correctly  rank -ordered  values 

2  (numerical  values  and  counters  unchanged) 

• 

m 


(m+1^+1  correctly  rank-ordered  values 

(m+2)+l  (numerical  values  unchanged  — 

;  each  counter  incremented  by  1) 

(m+(n-m))+l  =  n+1 


The  entire  stack  comparator  is  still  correctly  rank-ordered,  thus  the  final  assertion 
q  is  true.   Formally, 

(p  and  condition)  {SI}  q 
(p  and  not  condition)  {S2}  q 


therefore         p  {if  condition  then  SI  else  S2}  q  is  true  by 
rule  of  inference. 

Q.E.D. 

5.2.3  Time  Complexity 

The  run  time  complexity  of  the  stack  comparator  sort  operation  just 
described  has  an  upper  bound  that  is  dependent  on  the  hardware  implementation 
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for  the  counter  associated  with  the  top  of  the  stack.  This  counter  is  connected  to 
the  less  than  output  of  each  comparator  in  the  stack. 

Theorem  3:  If  the  less  than  output  of  each  comparator  is  buffered  in  a 
sequential  fashion  to  the  counter  associated  with  the  top  of  the  stack,  then  the 
time  complexity  for  the  stack  comparator  is   0(N)  in  the  average  case. 

Proof:   Using  the  state  of  the  stack  definitions  from  the  proof  of  concept, 
Section  5.2.2,  suppose  the  counters  correctly  rank-order  n  respective  values  on  an 
N  length  stack  comparator,  where  n  <  N.  Let  the  n-th  value  be  pushed  onto 
the  stack.  If  the  top  value  is  less  than  m  values  and  greater  than  (n— 1)— m 
values,  where  0  <  m  <  n-1,  then  the  top  counter  will  be  incremented  m  times 
from  its  initial  count  of  1.   Since  the  m+1  count  is  gathered  sequentially,  the 
worst  case  scenario  is  when  m  =  n— 1  and  the  best  case  is  m  =  0. 

If  a  ripple  type  counter  is  used,  the  accumulative  number  of  clock  cycles 
used  to  sort  all  N  values  is  the  complexity  metric.  In  the  worst  case,  for  each 
value  being  pushed  onto  the  stack,  the  accumulative  number  of  clock  cycles  at  the 
top  counter  is 

1  +  2  +  3  +  •  •  •  +  N. 
The  accumulative  number  of  clock  cycles  in  the  best  case  is  simply  N.  Therefore, 
the  time  complexity  for  the  average  case  can  be  defined  as 

T(N)  =  (l  +  2  +  3  +  ...  +N)/N 

=  [N(N+l)/2]/N 

=  (N+l)/2  =  0(N). 

Q.E.D. 
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Theorem  4:  If  the  less  than  output  of  each  comparator  is  sent  directly  to  an 
encoder/counter  (i.e.,  in  a  parallel  fashion)  associated  with  the  top  of  the  stack, 
then  the  time  complexity  for  the  stack  comparator  is  8(N). 

PROOF:  Since  all  counters  are  incremented  in  parallel,  the  delta  time  for 
propagation  and  counter  update  of  each  comparator  in  the  stack  is  the  same 
constant,  r.     After  N  values  have  been  pushed  onto  the  stack,  the  run  time 
complexity  T(N)  is 

T(N)  =  r-N=  0(N), 

i.e.,  the  upper  bound. 

The  lower  bound  is  fl(N).  There  are  N  elements  to  be  sorted  which  are 
received  as  input  and/or  produced  as  output  at  the  rate  of  one  per  time  unit. 
Thus  fl(N)  time  units  are  required  by  any  parallel  sorting  algorithm,  regardless  of 
the  time  taken  by  the  actual  sorting  process  itself.   Because  the  upper  bound 
matches  the  lower  bound,  the  stack  comparator  sorting  circuit  (algorithm)  is  said 
to  be  optimal,  therefore,  fl(N). 

Q.E.D. 

5.3  System  Integration  and  Operation 

The  stack  comparator  concept  is  a  natural  extension  of  the  accumulation 
processor  described  in  Section  3.2  (refer  to  Figure  13).  The  ACCUM  register  (top 
of  the  stack)  and  registers  B  through  I  can  be  integrated  to  form  the  stack 
comparator,  and  under  proper  control,  retain  their  image  storage  (image  planes) 
intention. 
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The  operation  is  essentially  the  same  as  any  convolution  operation.  First, 
define  a  template  with  N  =  contemplate)  for  the  desired  window  of  the  OS 
filter  and  assign  the  value  0  everywhere  in  the  template  support.   Second,  make 
the  appropriate  procedure  call  with  the  desired  rank -order. 

The  spatial  configuration  processor  will  translate  the  image  relative  to  each 
location  in  the  support  of  the  template.  The  weighting  processor  will  pass  the 
image  pixel  value  to  the  accumulation  processor  (add  zero).  Each  translation  of 
the  image  will  therefore  accumulate  a  sorted  value  in  the  stack  comparator.  The 
rank-order  will  be  output  as  the  respective  pixel  value  in  the  filtered  image. 
Allowing  for  the  signal  propagation  through  the  circuits  of  the  comparators,  the 
OS  filtered  image  will  be  computed  at  essentially  the  same  throughput  as  the 
convolution. 


CHAPTER  6 
CONCLUSIONS  AND  SUGGESTIONS  FOR  FURTHER  RESEARCH 


The  electro-optical  image  algebra  processing  system  introduced  in  this 
dissertation  presents  a  feasible  solution  to  the  ATR  problem.  In  addition,  these 
initial  findings  set  the  stage  for  continued  development  of  this  parallel  architecture 
over  a  wide  range  of  image  processing  applications.  The  tight  coupling  of  image 
algebra  with  parallel  computation  is  fundamental  to  a  deeper  investigation  of 
parallel  computation  in  general. 

More  specifically,  it  has  been  shown  that: 

(1)  The  digital-optical  correlator  that  incorporates  the  arithmetic  logic 
operations  of  the  image  algebra  provides  a  robust  target  acquisition  and 
identification  system.  The  optical  correlator  alone  provides  only  the  identification 
of  targets  that  are  stored  in  its  matched  filters.  The  correlator  matched  filter 
operation  is  variant  with  respect  to  the  range  and  aspect  to  a  given  target.  As  a 
result,  a  series  of  composite  matched  filters  are  used  (cycled)  to  make  the 
correlator  operation  invariant  over  a  finite  operational  volume.  The  identification 
procedure  can  only  take  place  inside  this  operational  volume.  While  the  system  is 
outside  this  operational  volume,  lower  level  image  processing  algorithms,  expressed 
in  the  algebra,  are  used  to  provide  the  acquisition;  i.e.,  the  guidance  update 
information  based  on  target  classification.   Robustness  is  inferred  by  the  overall 
system  operation  and  efficiency.   Coupling  the  two  main  functions,  acquisition  and 
identification,  into  a  single  compact  system  provides  a  wide  range  of  operation 
with  lower  weight  and  volume.  The  gain  in  efficiency  is  made  by  using  the 
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capability  of  the  image  algebra  to  preprocess  the  input  information  and  selectively 
reduce,  by  target  classification,  the  matched  filter  cycle  selection. 

(2)  The  generalized  image-template  product  of  the  image  algebra  is 
equivalent  to  a  sequence  of  linear  convolutions  combined  by  the  respective  global 
reduce  operation  (Equations  13,  14,  and  25).   Since  optical  systems  are  linear,  this 
is  an  important  observation  when  considering  future  all  optical  or  electro-optical 
implementations  of  algebraic  operations. 

(3)  When  the  input  to  the  correlator  is  restricted  to  binary  imagery,  the 
proposed  electro -optical  implementation  using  the  image  algebra  developed  by 
Ritter  and  his  colleagues  is  more  powerful  than  the  homogeneous  binary  image 
algebra/DOCIP  combination.  More  powerful  is  defined  here  as  providing  more 
image  processing  operations  at  a  lower  time  complexity  metric. 

(4)  The  algebraic  constructs  presented  in  (2)  are  applicable  to  high 
resolution  holographic  film  (Vander  Lugt  system)  operations.    These  systems  do 
not  have  the  critical  response  time  that  is  inherent  in  the  ATR  application  and  can 
be  implemented  with  lower  hardware  complexity,  especially  convolution 
operations.  This  offers  another  wide  area  of  research. 

(5)  The  proposed  electro-optical  design  can  incorporate  a  fast  OS  filtering 
application  for  two-dimensional  imagery.  Its  implementation  has  wide  use  in  the 
ATR  environment.  The  speed  of  the  method  is  developed  around  a  unique  circuit 
designed  to  sort  N  sequentially  gathered  numerical  values  with  an  optimal  run 
time  complexity  of  fl(N).  Two  other  closely  related  methods  of  OS  filtering,  stack 
filters  and  morphological  algebra,  are  determined  to  be  very  limited  with  regard  to 
the  number  of  window  elements  that  can  be  used  in  the  OS  filtering  process  and 
not  practical  for  two-dimensional  imagery.  The  new  method  does  not  have  this 
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restriction.  Also,  the  new  OS  filtering  method  operates  at  or  near  the  same  cost 
function  as  a  single  convolution  with  the  same  template  configuration. 

While  preparing  this  dissertation  many  additional  technical  issues  have 
arisen.  The  following  discussion  addresses  some  of  these  as  additional  areas  for 
research.  The  author  intends  to  continue  this  specific  area  of  research  to  a 
hardware  development  end.   Any  results  obtained  that  relate  to  the  following 
would  be  appreciated. 

Recently,  Bell  Research  announced  the  development  of  a  fast  gray  level 
spatial  light  modulator  capable  of  submicrosecond  switching  [54].  The  design  is 
based  on  FLC  smectic  C  material  in  a  variable  twisted  layer  configuration.  The 
announcement  confirms  the  assertion  that  the  industry  is  moving  in  a  direction 
which  supports  the  developmental  element  of  the  proposed  correlator  design.  The 
immediate  question  is  whether  or  not  the  smectic  A  material  which  is  capable  of 
much  higher  switching  speeds  can  be  incorporated?  If  so,  can  the  A  material  in 
the  variable  twisted  layer  configuration  provide  more  than  8  bits  of  resolution? 

A  faster  I/O  is  needed  to  address  the  spatial  light  modulator.   Can  optically 
addressed  storage  methods  such  as  lithium  niobate  (LiNb03)  crystal  and 
holograms  be  used  with  the  proposed  design?  If  so,  the  method  of  composite 
matched  filter  storage  and  retrieval  can  not  only  increase  the  operational  volume 
for  an  individual  target,  but  the  number  of  targets  as  well. 

The  mathematical  concepts  used  to  develop  the  parallel  computational 
method  for  the  specific  design  in  this  dissertation  can  be  established  for  a  more 
general  use,  coprocessor  design.  The  fundamental  problem,  one  that  is  shared  by 
all  SIMD  type  processors,  is  the  I/O  time  complexity  for  each  processing  element 
(PE).  This  problem  is  one  that  continues  as  a  current  research  topic 
[9,16,35,45,70].   One  developmental  I/O  method  that  seems  ideally  suited  for  the 
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shifting  processor  of  Section  3.2  is  the  electro-optical  crossbar  described  by  Kumar 
et  al.  [24,47].  This  hybrid  reconfiguration  technique  is  used  for  the  free  space 
interconnecting  of  PEs.  The  gain  from  using  this  technique  is  that  the  image 
translation  operation  performed  by  the  shifting  processor  would  be  executed  in 
terms  of  nanoseconds.  Thus,  depending  on  the  execution  time  for  the  rest  of  the 
pipeline,  Section  3.2,  the  total  throughput  would  easily  be  the  equivalent  of  a 
uniprocessor  running  at  teraops  (1012  operations  per  second).  This  would  make  an 
ideal  image  algebra  accelerator  for  the  algorithm  development  environment  being 
developed  for  the  Air  Force  [60]. 

Another  interesting  research  proposal  is  the  application  of  the 
mathematical  concepts  formulated  in  this  dissertation  to  existing  computational 
architectures.  Three  architectures,  certainly  there  are  others,  that  have 
convenient  PE  communication  pathways  are:  the  MIMD  hypercube  [70],  the 
geometric  arithmetic  parallel  processor  [17],  and  the  advanced  systolic  array 
processor  [34].  The  MIMD  hypercube  has  the  most  potential  for  efficient 
execution,  especially  the  variant  template  operations  discussed  in  Section  3.6. 
Figure  29  illustrates  a  simple  addressing  scheme  for  executing  a  generalized  matrix 
product  operation  of  an  image  with  a  3-element  template  using  a  16  PE  hypercube 
(viewed  as  a  row  major  4«4  array). 

Finally,  the  evaluation  of  OS  filter  operations  using  the  stack  comparator 
concept  defined  in  Chapter  5  is  a  research  topic  for  both  image  processing  and 
DSP.  The  OS  filters  that  preserve  median  roots  is  an  important  aspect  of 
one-dimensional  signal  shaping.   However,  as  shown  by  Maragos,  Schafer,  and 
Wendt  et  al.  [52,73],  the  number  of  filters  grows  combinatorially  large  as  the 
number  of  filter  window  elements  is  increased.  The  relevance  and  the  selection  of 
these  filters  (templates)  with  regard  to  two-dimensional  image  processing  is 
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Input  image  a  loaded  in  0000  through  1111.     Let  template  t(n)  = 
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Figure  29.  A  generalized  image-template  product  hypercube  execution. 
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Figure  30.  Stack  comparator  implementation  of  asymmetric  median  filter. 
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unknown.  Their  implementation  using  the  stack  comparator  is  trivial.  Referring 
to  the  definition  of  terms  for  Equation  35,  any  necessary  weighting  of  the  rank 
order  (used  to  preserve  the  root)  is  accomplished  in  the  stack  comparator  by 
simply  clocking  the  respective  window  translation  position  the  number  of  times 
that  matches  the  desired  weighting.  Figure  30  compares  this  procedure  for  the 
asymmetric  median  filter  example  from  Figure  24.   Note  that  the  example 
considers  the  signal  to  be  time  invariant;  it  does,  however,  illustrate  the  shifting 
principle  that  would  be  employed  for  the  two-dimensional  case.  For  time  variant 
signal  processing,  the  stack  comparator  can  be  easily  reconfigured  as  a  queue  and 
capable  of  real-time  DSP  without  the  combinatorial  growth  in  circuitry. 
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